CONCEPTS OF FUZZY MODEL ASSESSMENT


1.  INTRODUCTION 


When tracking a contact via passive sonar, one finds that the solution structure of the track is
in large part determined by the stationarity assumptions made on the stochastic processes
associated with the tracking solution. Stationarity is assumed for the noise processes, the
kinematics of the two vessels, and the acoustic channel linking the source and the observer,
which in our model correspond to the contact and own ship, respectively. Essentially, one is
forced to assume the processes are piecewise stationary, i.e., stationary for periods of time with
the change points between stationary periods occurring quickly with respect to the expected
length of stationarity. During periods of stationarity, a filter can track the contact. Between the
periods of stationarity, the filter may lose track and should in any event be reinitialized after the
switching period between stationary periods. Thus, one has a renewal process with periods of
stationarity during which tracking is possible. The points of renewal in this process are the ends
of the transition periods between stationary periods.

To successfully model this process, one must not only design the tracking filters for the
stationary periods but also detect the change points and determine the new propagation model
that will hold during the next period of stationarity. The essence of model assessment is the
classification of the possible propagation models. Model assessment is a pattern recognition
problem, which can be approached in many ways. Here a fuzzy system is used to determine the
possible models and the confidence associated with each model. From this information, a
decision is made as to which possible models should be maintained in constructing the tracking
solution during the next period of stationarity. This report describes the application of fuzzy
system modeling to propagation model assessment, with the emphasis on theoretical issues.


1.1  DATA  ABSTRACTION  OF  THE  TRACK 

The piecewise stationarity of the observation process is reflected in the tracking solution.
The overall tracking solution is a string of tracks built up over many periods of stationarity. A
track established on a single period of stationarity is called a segment. For a single contact, the
piecewise construction of the solution leads to the hierarchical representation of the tracking
solution as a tree. For a set of contacts, the solution is a forest of trees. Segments are classified
according to the mechanism that caused the segment formation. The mechanisms include the
following:

1.  Change  of  propagation  path  (pp)  (e.g.,  a  change  from  a  direct  path  to  a  bounce  path 
between  source  and  observer  or  vice  versa).  In  fact,  any  change  in  the  propagation  channel  fits 
into  this  category  including  multi-bounce  models. 

2.  Changes  in  the  own  ship  heading,  speed,  or  depth. 

3.  Changes  in  the  base  frequency  of  the  source  (bf),  which  should  produce  minor  changes 
in  propagation  channel  characterization  provided  the  change  in  frequency  is  a  small  percentage 
of  the  base  frequency. 

4. Changes in the contact heading, speed, or depth (h2), which are usually observable in
the Doppler and bearing measurements.

5. Changes in the system of unknown type (nl). Unknown anomalies are a way of
specifying the class of changes not modeled so far.

Segments not caused by contact maneuvers are used to build up contact segments, i.e., those parts
of the track where the contact has not changed its kinematics. A sequence of contact segments is
then used to form a contact track, which is represented as a tree in this hierarchy. The set of all
contact tracks is a collection of trees, or a forest, which is the contact situation
assessment.

This  solution  construction  is  illustrated  in  figure  1-1,  where  on  the  right  side  of  the  figure 
a  single  physical  contact  track  is  drawn  and  on  the  left  side  is  the  corresponding  data  abstraction 
that  represents  the  ideal  solution  for  this  track.  In  the  physical  track,  the  small  white  circles 
represent  data  points  in  the  x-y  plane. 


Figure 1-1. Solution Hierarchy Induced by the Piecewise Stationarity of the System Processes

Tracks formed during stationary periods are called segments and are drawn as short solid lines
fitted to the data points. The linear fit is the result of the tracking process or state estimation.
This fit is sometimes called feature extraction, data reduction, or data abstraction. As long as the
contact does not maneuver, the segments can be viewed as data points in a "super segment" called
a contact segment. Contact segments are drawn as bold dashed lines in the physical track, and
there are two contact segments in figure 1-1, where the track evolves in time from the bottom of
the figure to the top. The contact segments are concatenated to form a single physical track in
this figure. Since the physical track is built up by successive data abstraction, a natural data
hierarchy emerges, as shown on the left-hand side of figure 1-1. The most basic data elements, or
data points, form the first level of the hierarchical structure. Proceeding up the data hierarchy,
groups of points generated during track formation produce the segments, groups of segments
form contact segments, groups of contact segments form contact tracks, and at the top of the
hierarchy, the set of all contact tracks forms the contact assessment. Each level of the hierarchy
has a corresponding physical meaning shown by the arrows.
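
The hierarchy described above can be pictured as a nested data structure. The following Python
sketch is only an illustration: the class names (Segment, ContactSegment, ContactTrack), the
field names, the sample coordinates, and the helper count_points are assumptions made for this
example and do not come from the report.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Point = Tuple[float, float]  # one data point in the x-y plane


@dataclass
class Segment:
    """Track fitted to the data points of one period of stationarity."""
    points: List[Point]
    cause: str = "nl"  # hypothetical label for the mechanism that formed the segment


@dataclass
class ContactSegment:
    """Run of segments over which the contact does not maneuver."""
    segments: List[Segment] = field(default_factory=list)


@dataclass
class ContactTrack:
    """Tree for one contact: contact segments -> segments -> points."""
    contact_segments: List[ContactSegment] = field(default_factory=list)


def count_points(forest: List[ContactTrack]) -> int:
    """Walk the forest (the contact assessment) down to the data points."""
    return sum(
        len(seg.points)
        for track in forest
        for cseg in track.contact_segments
        for seg in cseg.segments
    )


# Two contact segments, as in the figure; the second starts at a contact maneuver.
track = ContactTrack([
    ContactSegment([Segment([(0.0, 0.0), (1.0, 1.0)], cause="pp"),
                    Segment([(2.0, 2.0), (3.0, 3.1)], cause="bf")]),
    ContactSegment([Segment([(4.0, 3.0), (5.0, 2.9), (6.0, 3.0)], cause="h2")]),
])
forest = [track]  # one tree per contact
```

Each level of nesting corresponds to one level of data abstraction, so moving up the structure
discards point-level detail in the same way the text describes.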

For the tracker to perform successfully, the system must be able to detect the change
points in the stochastic processes that are affecting the system, and then classify them so that one
can reinitialize the tracking filters properly. This latter task, if successful, saves
computational resources because the alternative is to initialize a new filter for every possible
scenario and see which ones converge. Previous efforts in this field have used Dempster-Shafer
evidential reasoning and compatibility maps that map observed effects to possible causes and, at
the same time, produce a consistent basic probability assignment (bpa) to the possible causes.
Reference 1 describes one such effort. Although this subtask is critical, it is by no means the
only issue in designing a Fuzzy Expert System. This report considers modeling issues relevant
to the Fuzzy Expert System Tracker. Many of these issues are also relevant to the Contact
Management Model Assessment (CMMA) simply because the tracker includes the CMMA
functionality within the tracking mechanism. A more complete description of the CMMA
system is contained in reference 2. This report emphasizes theoretical issues. The description
and performance of the fuzzy CMMA system are topics for future work, once a suitable and
correct data set is obtained.


1.2  CLASSIFICATION  OF  THE  MODEL 

Classifying the cause of change points is a pattern recognition problem, which can be
formulated using many different methods. Statistical pattern recognition (SPR) is one and fuzzy
pattern recognition (FPR) is another. Although there are many different techniques and
approaches to pattern recognition, they all require (1) a correct model of the problem,
(2) extraction of a set of features that describe the differences in the classes, and (3) construction
of a decision rule to map the feature space to the decision space. Fukunaga (reference 3)
describes the pattern recognition process using the flow diagram of figure 1-2. In this figure, the
first three blocks represent the initial data collection, processing, and exploratory data analysis.
The data structure block consists of the search for structure in the data using clustering and
modeling techniques along with data reduction or feature extraction. Feature extraction is an art
requiring iterative refinement measured by the error estimates achieved. Finally, the features are
used as input to the classifier designed to fit the data for the anticipated use of the system. The
final product needs to be tested to validate the design procedure. Note the number of loops and
the emphasis on nonparametric methods to determine the performance of the classifier before it
is built. The flow diagram assumes a valid data set, which is not available at this point. So the
blocks that address normalization, error estimation, and actual clustering and modeling of the
features will have to wait until valid simulated data are available or real data are found. So in
terms of Fukunaga's flow chart, this report can only address the data structure analysis block and
the classifier design block. The approach is to replace the classical decision rules and the feature
extraction by their fuzzy system counterparts, which are the fuzzy rule base and the fuzzy term
sets, respectively. This issue will be discussed in sections 2 and 3.

Figure 1-2. Fukunaga's Flow Chart of the Process of Classifier Design


The goal of pattern recognition is to design a decision rule that partitions the feature
space into equivalence classes, one equivalence class per cause of change point. The extracted
features are usually represented as vectors in a Euclidean space. Thus, the decision rule is a map
from the observation space to the causes or to the description of the equivalence classes. So the
decision fits the traditional representation of a decision rule as a map from the observation space
to the set of integers $\{1, \ldots, c\}$, where each integer represents a class. The
statistical decision rule is shown in figure 1-3a as a map that places a unit value at the "proper"
class element and zero values in all the other elements. So for this example, $\delta(x) = 2$. In this
figure, the one-dimensional observation space is partitioned into five disjoint regions, and the
labeled areas are the characteristic functions associated with each of the regions. Classical sets
may be defined in terms of a characteristic function that takes on the value 1 if the point is in the
set and 0 if the point is not in the set, i.e.,

$$\chi_A(x) = \begin{cases} 1, & x \in A \\ 0, & \text{otherwise.} \end{cases}$$

For example, the second equivalence class has a characteristic function denoted by

$$\chi_2(x) = \begin{cases} 1, & x \in [b_1, b_2] \\ 0, & \text{otherwise.} \end{cases}$$

For this example, the statistical decision rule is given by

$$\delta(x) = \sum_{i=1}^{5} i\,\chi_i(x),$$

where only one of the $\chi_i(x)$ can be non-zero. In the vector $u(x) = [\chi_1(x), \ldots, \chi_5(x)]$, each
coordinate is associated with a class and so each decision is represented by a unit vector. So the
vector of classes for figure 1-3a is $u(x) = [0, 1, 0, 0, 0]$. The fact that only one element of the
function can be 1 also follows from the definition of partition. A point lies in one and only
one cell of the partition, hence in one and only one equivalence class, and thus in only one class.
This is known as a "hard" decision, where the word hard refers to the all-or-nothing quality of the
decision. In figure 1-3a, the sample x can belong only to one class, class 2. Conceptually, the
decision rule $\delta(x)$ and the vector $u(x)$ merely represent the decision in different formats. In the
latter format, $\delta(x)$ can be thought of as the index of the non-zero element in the vector $u(x)$, and
each element of $u(x)$ can be interpreted to mean the degree that the data point belongs to the class
associated with that index. In fact, the vector $u(x)$ represents the characteristic function of the
class chosen by the decision rule $\delta(x)$, so in effect, the decision can be thought of as a map from
the observation space to the space of characteristic functions representing the classes. Thus one
could recast the decision rule as a mapping $D: X \to \{0,1\}^c$, where $\{0,1\}$ is the two-element set
containing 1 and 0.


Figure 1-3a. Decision Rule in Statistical Pattern Recognition
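
The two formats of the hard decision, the class index and the unit vector of characteristic
function values, can be sketched in a few lines of Python. This is a minimal illustration under
stated assumptions: the boundary values in BOUNDS are invented, and the regions are taken as
half-open intervals so that they are disjoint.

```python
from bisect import bisect_right
from typing import List

# Hypothetical boundaries b_1 < b_2 < b_3 < b_4 that split a one-dimensional
# observation space into five disjoint regions (values are illustrative only).
BOUNDS = [1.0, 2.0, 3.0, 4.0]
NUM_CLASSES = len(BOUNDS) + 1


def chi(i: int, x: float) -> int:
    """Characteristic function of region i (1-based): 1 inside, 0 outside.

    Region i is the half-open interval [b_{i-1}, b_i), an assumption made
    here so that boundary points belong to exactly one region.
    """
    return 1 if bisect_right(BOUNDS, x) + 1 == i else 0


def u(x: float) -> List[int]:
    """Vector of characteristic-function values; always a unit vector."""
    return [chi(i, x) for i in range(1, NUM_CLASSES + 1)]


def delta(x: float) -> int:
    """Hard decision rule: the sum of i * chi_i(x), i.e. the class index."""
    return sum(i * chi(i, x) for i in range(1, NUM_CLASSES + 1))
```

With these boundaries, a sample at x = 1.5 falls in the second region, so delta(1.5) is 2 and
u(1.5) is the unit vector with a 1 in the second position.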

By replacing the partition of the data space by a soft partition, one can construct soft
decision rules. In the previous paragraph, a decision rule was interpreted as a mapping to the set
of characteristic functions. Characteristic functions reflect the philosophy of classical set theory:
a point either belongs to a set (takes value 1) or it does not (takes value 0); there is no
in-between. Similarly, fuzzy decision rules can be described as mappings to the set of membership
functions (MFs) of fuzzy sets. MFs generalize characteristic functions and define fuzzy sets.
When evaluated at a point, an MF determines the degree that the point belongs to the set. For a
fuzzy set A, the MF is denoted by $\mu_A(x)$ and takes on values in [0,1]. In SPR, a decision rule
partitions the observation space as shown in figure 1-3a, where each partition is a subset of the
observation space and that subset is defined by a characteristic function, say $\chi_2(x)$ as discussed
above. In a fuzzy partition, a point in the observation space can be in any of the classes with
varying degrees of membership. The memberships are constrained by the requirement that they
sum to one for each point in the observation space, i.e.,

$$\forall x \in X, \quad \sum_{i=1}^{c} \mu_i(x) = 1.$$

The decision is no longer a mapping into the set of integers $\{1, \ldots, c\}$ but is instead a mapping
into a space of vectors, $D: X \to [0,1]^c$, just as the hard decision rule may be thought of as a
mapping $D: X \to \{0,1\}^c$, where $u(x) = [\chi_1(x), \ldots, \chi_c(x)]$ represents the map. By softening the hard
decision rule so that one has a fuzzy partition of the observation space, the decision rule is
interpreted to be a vector-valued map to a fuzzy unit vector (fit vector), which is one
representation of the MF of a fuzzy set. Here, $\mu(x) = [\mu_1(x), \ldots, \mu_c(x)]$ is both the decision and
the fit vector. Each element of the fit vector is the membership of the data point in the class
denoted by the coordinate index. One class exists for each dimension of the vector. In the
notation of fuzzy sets, the fit vector can be represented by $\mu(x) = \mu_1(x)/1 + \cdots + \mu_c(x)/c$, where
each pair $\mu_i(x)/i$ is the membership of the point in class i: one class for each dimension of the
vector. Because the observation space partition is fuzzy, the vector of classes is no longer a unit
vector, but only a vector whose elements sum to 1.
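
A fuzzy partition and its fit vector can be sketched as follows. The triangular shape of the MFs
and the class centers in CENTERS are assumptions chosen for illustration; the report does not
specify them. Overlapping triangles whose feet sit on the neighboring centers are one simple way
to satisfy the sum-to-one constraint.

```python
from typing import List

# Hypothetical class centers on a one-dimensional observation space.
CENTERS = [0.0, 1.0, 2.0, 3.0, 4.0]


def fit_vector(x: float) -> List[float]:
    """Memberships of x in each class using overlapping triangular MFs.

    Each triangle peaks at its own center and reaches zero at the
    neighboring centers, so the memberships sum to 1 for every x between
    the first and last center (x is clamped to that range).
    """
    x = min(max(x, CENTERS[0]), CENTERS[-1])
    mu = []
    for k, c in enumerate(CENTERS):
        left = CENTERS[k - 1] if k > 0 else c
        right = CENTERS[k + 1] if k < len(CENTERS) - 1 else c
        if x == c:
            mu.append(1.0)
        elif left < x < c:
            mu.append((x - left) / (c - left))
        elif c < x < right:
            mu.append((right - x) / (right - c))
        else:
            mu.append(0.0)
    return mu
```

A point at x = 1.25 then has membership 0.75 in the second class and 0.25 in the third, whereas
the hard rule would have assigned it to a single class.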

Using the same observation space of figure 1-3a, consider the fuzzy partition shown in
figure 1-3b. Now the classes are not disjoint and overlap exists between the classes. The
partition of the observation space is fuzzy and is defined by the fuzzy sets, one for each class. As
noted, the fuzzy sets are the generalizations of the characteristic functions defining the classical
partition of figure 1-3a. In a fuzzy partition, a data point, represented by x in this figure, may
support a decision with varying degrees in class 2 or class 3 or both. Thus, each possible
decision has associated with it a membership value, which represents the degree of truth with
which the data support that decision. In fact, the decision rule can be generalized further, so that
the elements of the vector of classes need not sum to 1; each element of the vector need only be
bounded above by 1. In summary, SPR produces a single unambiguous decision,
whereas FPR produces an MF that evaluates for each class to the belief that the data belong to
that class.


1.3 EXTENSION OF THE CLASSIFICATION PROCESS


This report takes the above concepts one step further: the decision rule maps from
the feature space to interval-valued sets, one set for each class. So each element of the class
vector is no longer a real number between 0 and 1, but instead a closed subset of the interval
[0,1]. Figure 1-4a illustrates this decision rule for the same observation space of figure 1-3a.
Classes 2 and 3 have some measure of support, but now that support is represented by an
interval. In general, all the classes can have interval-valued support and the general form of the
decision rule is $\mu(x) = \Lambda_1(x)/1 + \cdots + \Lambda_c(x)/c$, where each $\Lambda_i(x)$ is a closed interval. Each interval
represents the range of truth with which the data can support the decision for that class, where
truth is measured on a scale from zero to one. The data are represented by fuzzy numbers,
illustrated by the shaded region in the observation space. This vector of intervals is described in
the literature as a fuzzy set of type 2 and called an interval-valued fuzzy set (reference 4, p. 14).
The interval-valued measure of truth is a generalization of FPR. Because the decision rule maps to
a closed interval, which itself can be represented by a characteristic function or an MF, one can
illustrate the decision rule as shown in figure 1-4b. Conceptually, no difference exists between
figure 1-4a and 1-4b; the only change is the representation of the certainty intervals yielded by the
decision rule. The decision system described in this report extracts features that are fuzzy sets,
uses a collection of fuzzy rules to replace the decision rule of SPR, and uses interval-valued fuzzy
sets to represent the output of the decision rule. Note that each interval in this figure may be the
result of many rules. Each rule tries to assess the support a feature gives to some class. If an
unambiguous decision is required, then further processing is needed. The important point is that
the entire pattern recognition process is extended to produce not just a single membership value
for each class, but a collection of MFs represented by interval-valued fuzzy sets.

In the following sections, fuzzy logic is applied to generate a decision rule of the form
$\mu(x) = \Lambda_1(x)/1 + \cdots + \Lambda_c(x)/c$ and then applied to model assessment in contact tracking. In
classical decision theory, the decision rule is derived by optimizing a functional such as the
probability of error or another loss function. In this report, the decision rule is never explicitly
stated because the certainty intervals are the result of a fuzzy rule base that models the governing
expert decision system. In section 2, fuzzy systems are discussed in more detail to explain how
the fuzzy features are extracted from the data and processed by fuzzy rules to yield a decision
that is interval valued as shown in figure 1-4. In section 3, interval-valued fuzzy logic is
described and used to solve the pattern recognition problem. This discussion includes the
determination of the satisfaction of a fuzzy premise by a fuzzy feature, the representation of the
strength of the rule, the propagation of evidence through the rules, and the aggregation of support
for a given conclusion. In section 4, an exploratory version of a fuzzy model assessment system
is discussed. Included in this discussion is the construction of the term sets, the rule syntax and
implementation, the object-oriented aspects of the program, and examples of the outputs. In the
last section, the implications of this work are given and possible extensions are suggested.
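
To make the idea of an interval-valued decision concrete, the sketch below shows one simple way
such intervals could arise; it is an assumption for illustration, not the construction used in the
report (which is developed in section 3). Here the datum is simplified to a crisp interval [a, b]
rather than a general fuzzy number, each class has a hypothetical triangular MF, and the support
for a class is taken to be the range of its MF over the datum.

```python
from typing import Dict, Tuple

Interval = Tuple[float, float]


def tri(x: float, left: float, peak: float, right: float) -> float:
    """Triangular MF that is zero outside (left, right) and 1 at peak."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def support_interval(data: Interval, left: float, peak: float, right: float) -> Interval:
    """Closed interval [min, max] of the MF over an interval-valued datum.

    A triangular MF is unimodal, so its minimum over [a, b] occurs at an
    endpoint and its maximum is 1 if the peak lies inside [a, b].
    """
    a, b = data
    ends = (tri(a, left, peak, right), tri(b, left, peak, right))
    high = 1.0 if a <= peak <= b else max(ends)
    return (min(ends), high)


# Hypothetical classes 1..4 with triangular MFs (left, peak, right).
CLASSES = {1: (-1.0, 0.0, 1.0), 2: (0.0, 1.0, 2.0),
           3: (1.0, 2.0, 3.0), 4: (2.0, 3.0, 4.0)}

datum = (1.25, 1.75)  # uncertain observation between classes 2 and 3
decision: Dict[int, Interval] = {
    k: support_interval(datum, *params) for k, params in CLASSES.items()
}
```

In this example only classes 2 and 3 receive non-degenerate intervals, which mirrors the
situation described for figure 1-4a.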


2. FUZZY DECISION SYSTEMS


Fuzzy decision systems use fuzzy logic to reason about decisions. These systems
represent signals as fuzzy numbers or linguistic variables and employ fuzzy rules to process these
signals. In this section, a cursory description of linguistic variables is presented and the basic
concepts of fuzzy systems are described. A simple control example is given to illustrate how
fuzzy systems process the input signals to yield a control signal. This example also gives the
simplest illustration of a certainty measure associated with the conclusion. Fuzzy decision
systems are compared with classical decision systems and a physical interpretation of the system
flow is given.


2.1  EMBEDDING  FUZZY  SYSTEMS 

To understand fuzzy decision systems, one must first understand the signals that flow
through them, i.e., the linguistic variables. Formally, a linguistic variable is defined (reference 5,
p. 132) as a quintuple (x, T(x), U, G, M), where x is the name of the variable, T(x) is the term set
of the variable x, U is the domain of definition of the variable x, G is a rule that names the
terms, and M is the fuzzy set that is used to define each term. All this formality can be easily
explained using an example. Consider the linguistic variable called COLOR, where the
linguistic variable name is x = COLOR. Figure 2-1 illustrates this linguistic variable, where the
term set T(x) = {RED, YELLOW, ORANGE, GREEN, BLUE, INDIGO, VIOLET}. The rule
G that generates the names of the terms is just the list of basic colors but, in general, G can be
a complicated grammar. The universe of discourse U, or domain of definition, is the frequency
range of the visible spectrum. Finally, the meaning or semantic definition of the terms is given in
this example by a fuzzy set, drawn as a bell-shaped curve around the central frequency associated
with the color.

The shapes of the linguistic terms are important and are usually defined mathematically by
MFs such as $\mu_{\mathrm{RED}}(x)$. Intuitively, this function $\mu_{\mathrm{RED}}(x)$ is interpreted as the degree of
membership of the frequency x in the set of RED colors, or how "red" the frequency x is. The
range of the function $\mu_{\mathrm{RED}}(x)$ is [0,1], where 0 means no membership in the set and 1 means
total membership in the set of red colors. Partial membership is best illustrated by the example
of a sunset, which is reddish-orange, having membership in both the RED and ORANGE terms.
In figure 2-1, this frequency is labeled reddish-orange. Another example in this figure is the
frequency labeled chartreuse, a frequency half-way between GREEN and BLUE. The terms of
the linguistic variables can divide the space, producing a fuzzy partition, i.e.,

$$\forall x \in X,\ \exists \text{ at least one } y \in T \text{ s.t. } \mu_y(x) > 0 \text{ and } \sum_{y \in T} \mu_y(x) = 1,$$

so that at every frequency at least one MF is non-zero and the sum of the MFs at that value of
frequency is 1. Note that the terms of the linguistic variable COLOR do not form a fuzzy partition
because, by inspection, at the frequency located between BLUE and GREEN, the sum of the
membership values is less than one. Fuzzy partitions must therefore have special term sets.
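
The COLOR example can be mimicked in code to show why bell-shaped terms generally fail the
fuzzy-partition test. The sketch below is illustrative only: the normalized axis from 0 to 6, the
Gaussian form of the bell curves, the width 0.3, and the names LinguisticVariable, bell, and
is_fuzzy_partition are assumptions, and the naming rule G is reduced to a fixed list of names.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass
class LinguisticVariable:
    name: str                                   # x, the variable name
    universe: Tuple[float, float]               # U, the domain of definition
    terms: Dict[str, Callable[[float], float]]  # T(x), each term with its MF

    def memberships(self, value: float) -> Dict[str, float]:
        """Degree of membership of a crisp value in every term."""
        return {t: mf(value) for t, mf in self.terms.items()}

    def is_fuzzy_partition(self, samples: int = 101, tol: float = 1e-6) -> bool:
        """Check on a grid that some MF is non-zero and the MFs sum to 1."""
        lo, hi = self.universe
        for k in range(samples):
            v = lo + (hi - lo) * k / (samples - 1)
            mu = list(self.memberships(v).values())
            if max(mu) <= 0.0 or abs(sum(mu) - 1.0) > tol:
                return False
        return True


def bell(center: float, width: float) -> Callable[[float], float]:
    """Gaussian bell-shaped MF (one possible bell shape, assumed here)."""
    return lambda v: math.exp(-0.5 * ((v - center) / width) ** 2)


# Term centers placed on an arbitrary normalized axis, not physical frequencies.
names = ["RED", "ORANGE", "YELLOW", "GREEN", "BLUE", "INDIGO", "VIOLET"]
color = LinguisticVariable(
    "COLOR", (0.0, 6.0), {n: bell(float(k), 0.3) for k, n in enumerate(names)}
)
```

Evaluating color.memberships(3.5), a point midway between the GREEN and BLUE centers, gives
memberships that sum to well under one, so color.is_fuzzy_partition() returns False.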

Some of the concepts of a fuzzy system for model assessment are best described by using
linguistic variables. Bezdek outlines these concepts best in his introductory article in the first
issue of the IEEE Transactions on Fuzzy Systems (reference 6). Figure 2-2 illustrates the fuzzy
system model used in this report and reference 6. In this figure, the data are first fuzzified, i.e.,
transformed or mapped into linguistic variables. The fuzzy system (FS) processes these
linguistic variables by using rule-based methods and obtains one or more conclusions. The
processing is carried out by the fuzzy inference engine using the rules stored in the fuzzy rule
base, where application-specific knowledge is contained in the data term sets. So the processing
solves the problem. However, the rules yield answers that are in terms of fuzzy sets or in terms
of conclusions with associated certainties. To apply the answers, one must be able to project
back into the control or decision space. In a control problem, this space is a control value or
vector. In a decision problem, this space is a finite set of decisions. The process of projection is
called defuzzification. The four-step process is summarized by Bezdek as "fuzzify, solve,
defuzzify, control" (reference 6, p. 3).
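
The fuzzify, solve, defuzzify, control sequence can be sketched for a toy one-input controller.
Everything specific in the sketch is an assumption made for illustration: the term names, the
three rules, the representative output values, and the weighted-average defuzzifier, which is
one common choice and not necessarily the one used in the report.

```python
from typing import Dict


def fuzzify(error: float) -> Dict[str, float]:
    """Map a crisp error in [-1, 1] to memberships in three linguistic terms."""
    e = max(-1.0, min(1.0, error))
    return {
        "NEG": max(0.0, -e),
        "ZERO": 1.0 - abs(e),
        "POS": max(0.0, e),
    }


# Hypothetical rule base: IF error IS <premise> THEN command IS <conclusion>.
RULES = {"NEG": "INCREASE", "ZERO": "HOLD", "POS": "DECREASE"}

# Representative crisp value of each output term (an assumption).
OUTPUT_VALUES = {"INCREASE": 1.0, "HOLD": 0.0, "DECREASE": -1.0}


def solve(inputs: Dict[str, float]) -> Dict[str, float]:
    """Fire every rule; each conclusion inherits its premise's membership."""
    out: Dict[str, float] = {}
    for premise, conclusion in RULES.items():
        out[conclusion] = max(out.get(conclusion, 0.0), inputs[premise])
    return out


def defuzzify(outputs: Dict[str, float]) -> float:
    """Weighted average of the output term values (one common defuzzifier)."""
    total = sum(outputs.values())
    if total == 0.0:
        return 0.0
    return sum(OUTPUT_VALUES[t] * w for t, w in outputs.items()) / total


def control(error: float) -> float:
    """Fuzzify, solve, defuzzify, and return the crisp control value."""
    return defuzzify(solve(fuzzify(error)))
```

For an error of 0.5, the ZERO and POS terms are each half satisfied, so the HOLD and DECREASE
conclusions carry equal weight and the crisp command is half-way between their values.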

Figure 2-2. Fuzzy System Model

This solution process is similar to using Fourier transforms to solve a linear system. Here
one transforms the input into the complex frequency domain and multiplies the input transform by
the system transform, thereby solving for the output in the transformed space. However, to know
what this output is, the inverse Fourier transform must be applied to return to the time domain:
transform, solve, and take the inverse transformation. Using transforms is similar to, but not the
same as, the four-step process of fuzzy systems. Fuzzy systems are similar in that one transforms
to another representation to make the problem easier, and then converts back from that
representation. However, the Fourier transform is invertible in some sense because the inversion
from the transformed space totally recovers the signal for a given class of signals.

The linguistic variables used in the fuzzy system are not just another representation of the
input. They are, in a sense, an embedding of the input into the space of linguistic variables. This
process can represent numerical vectors or functions without loss of fidelity. Since linguistic
variables can represent data exactly, one could think of these as embedding the input into this
larger space. However, this is not usually the case. Once the data have been fuzzified, i.e.,
mapped into linguistic terms, usually some information has been lost so the input cannot be
retrieved. The important system information has been extracted from the input, and this
information is then used to solve the problem. The model trades representational complexity
for solution tractability and robustness.

Embedding as a solution technique has long been used by mathematicians. An example
is the evaluation of real integrals by complex analysis. Here one embeds the integrand in the
complex number system, and the path of integration within the complex plane. The path of
integration includes the real number line, so that one can apply the residue theorem in evaluating
the integral on a closed path, thus obtaining the real integral as part of the entire solution. Here
the portion of the integral along the real number line represents the solution and the process of
projecting back onto the real number line becomes a triviality. The similarity is more evident
when the process is described as complexify, solve, and decomplexify (reference 6).
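
A standard textbook instance of this technique, offered here only as an illustration and not
taken from the report, is the integral of 1/(1 + x^2) over the real line. The closed path C is
assumed to be the segment [-R, R] together with the upper semicircle of radius R, which encloses
the single pole at z = i.

```latex
% Complexify: replace the real integrand by f(z) = 1/(1+z^2) on the closed path C.
\oint_C \frac{dz}{1+z^2}
  = 2\pi i \,\operatorname{Res}_{z=i} \frac{1}{(z-i)(z+i)}
  = 2\pi i \cdot \frac{1}{2i}
  = \pi .
% Solve: on the semicircle |z| = R the integrand is O(R^{-2}) and the arc
% length is \pi R, so the arc contribution vanishes as R \to \infty.
% Decomplexify: what remains is the integral along the real line.
\int_{-\infty}^{\infty} \frac{dx}{1+x^2} = \pi .
```

The projection back to the real line is trivial here because the real integral is simply the part
of the closed-path integral that survives the limit.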

For the system to solve the problem, linguistic variables are used to fire fuzzy rules. The
rules are mappings from the transformed input space of linguistic variables to the solution space
of linguistic variables. Multiple levels of rules can be thought of as a composition of mappings.
Unfortunately, these mappings are multivalued. So for control problems, the rules can produce
ambiguous results. In the case of decision problems, the rules can produce a set of possible
decisions. In binary logic, either a rule fires or it does not, depending on whether the premise is
satisfied. But in fuzzy logic, all the fuzzy rules fire with varying levels of satisfaction of
the premise and with varying levels of confidence in the implication, leading to conclusions with
varying levels of belief. These levels of satisfaction, confidence, and belief are termed
certainties, and the passage of confidence through the rules is called propagation of evidence or
propagation of certainty through the rules.
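
The flow of certainty through a set of rules can be sketched as follows. The combination
operators are assumptions chosen for illustration (min to combine premise satisfaction with rule
confidence, max to aggregate support from several rules); the feature names and confidence
values are invented, and only the cause codes pp, bf, and h2 are borrowed from section 1.1.

```python
from typing import Dict, List, NamedTuple


class Rule(NamedTuple):
    premise: str       # linguistic term the rule tests
    conclusion: str    # decision the rule supports
    confidence: float  # certainty attached to the implication, in [0, 1]


def propagate(satisfaction: Dict[str, float], rules: List[Rule]) -> Dict[str, float]:
    """Fire every rule and return the belief in each conclusion.

    Belief from one rule is min(satisfaction, confidence); beliefs from
    several rules with the same conclusion are aggregated with max.
    Both operators are assumptions, not the report's stated method.
    """
    belief: Dict[str, float] = {}
    for rule in rules:
        s = satisfaction.get(rule.premise, 0.0)
        b = min(s, rule.confidence)
        belief[rule.conclusion] = max(belief.get(rule.conclusion, 0.0), b)
    return belief


# Hypothetical premises with their degrees of satisfaction.
satisfaction = {"BEARING_RATE_SHIFT": 0.7, "FREQUENCY_SHIFT": 0.4}

rules = [
    Rule("BEARING_RATE_SHIFT", "h2", 0.9),
    Rule("FREQUENCY_SHIFT", "h2", 0.6),
    Rule("FREQUENCY_SHIFT", "bf", 0.8),
    Rule("AMPLITUDE_DROP", "pp", 0.5),
]

belief = propagate(satisfaction, rules)
```

Every rule fires, including the one whose premise has zero satisfaction, so the output is a set
of candidate decisions with graded belief rather than a single answer.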

With multiple answers at the output of the system, the control designer has to take the
multitude of answers and disambiguate or interpolate between them to obtain a single value.
This process in FSs is called defuzzification, which is analogous to the projection back into the
solution space when solving by embedding. In decision systems, multiple answers mean the
designer must select a subset of the solutions and use these to further the decision process in
some manner. Methods for handling ambiguity in the decision process are an active area of
research.

The philosophy of a fuzzy system bears a resemblance to the linear systems concept of
resolution of signals into sinusoids or impulse functions. In fuzzy systems, the representation or
signal space is now linguistic, which resolves the signal into membership in the term sets of the
linguistic variable. The solution technique uses fuzzy rules applied to the input term sets and
yields an output fuzzy set or a set of decisions. In linear systems, the signal resolution is
combined at the end of the analysis using superposition. In fuzzy systems, the output linguistic
terms are aggregated in a nonlinear fashion. The defuzzification of the output term often
requires a functional to map the output to the solution space. The strength of fuzzy systems is
that in some circumstances, fuzzy rules can model nonlinear systems more easily than more
mathematically precise methods. The rules can be heuristic in nature and derived not just from
the physical model of the system, but from the experts' knowledge of the behavior of the system
learned over time. The learning need not be from just human experts, but can also be derived
from learning methods such as neural networks (NNs).

Fuzzy systems are applicable to model assessment because humans can take the sensor
readings and, using heuristics, determine what type of propagation path or model is now valid.
Heuristics can be implemented as fuzzy rules, and it is hoped that these rules will be as efficient
and as accurate as human observers. The process is one of pattern recognition, i.e., given a set of
tracking residuals and knowing the current track, determine the new propagation model.
Although this can be modeled as a statistical pattern recognition problem, it then becomes a hard
decision problem, i.e., only one decision is accepted. However, this really does not make sense
from an operational point of view. If the data do not support a clear and obvious model for the
propagation, but instead a set of possible models, then all of these models should be pursued in
parallel until further evidence resolves the ambiguity. Clearly, it is more efficient to resolve this
ambiguity as soon as possible, and this is part of the reason for doing the work. However, if one
discards the correct model by using a hard classifier, then the contact can be lost because the
correct model is not used in the tracking process. The approach is to make a soft decision, clearly
identifying the most promising model candidates with the hope that the true model is one of the
model candidates.


2.2  FUZZY  CONTROL  EXAMPLE 

The four-step solution sequence is "...fuzzify, solve, defuzzify, control" (reference 6).
Fuzzification is based on the definition of the terms in the linguistic variables. For any data
value or input $x$, the terms $\tau$ that have support at $x$, i.e.,
$\{\tau \mid \tau \in T(x),\ \mu_\tau(x) > 0\}$, are used to
code or to fuzzify the data into the terms of the linguistic variable. Lotfi Zadeh, the father of
fuzzy sets and systems, calls this "fuzzy quantification." So in the example of the sunset, the
terms supported by the reddish-orange hue are {RED, ORANGE}. For a sensor measuring some
quantity such as speed, the terms may be {ZERO, POSITIVE SMALL}. It is important to
remember that associated with each value $x$ and with each applicable term $\tau$, a membership
value exists describing the degree of membership in this term. This value is very useful, as
illustrated in the following control example.
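
As a rough illustration of the fuzzification step, the sketch below codes a crisp speed reading into the terms that support it. The triangular breakpoints and the helper names `triangle` and `fuzzify` are assumptions made for illustration only; the report does not give numerical term definitions.

```python
def triangle(a, b, c):
    """Return a triangular membership function with feet at a and c and peak at b."""
    def mf(x):
        if x <= a or x >= c:
            return 0.0
        if x <= b:
            return (x - a) / (b - a)
        return (c - x) / (c - b)
    return mf

# Hypothetical term set for the linguistic variable "speed" (arbitrary units).
SPEED_TERMS = {
    "SMALL": triangle(-30.0, 0.0, 30.0),
    "MEDIUM": triangle(10.0, 40.0, 70.0),
    "HIGH": triangle(50.0, 80.0, 110.0),
}

def fuzzify(x, terms):
    """Return {term: membership} for every term with nonzero membership at x."""
    memberships = {name: mf(x) for name, mf in terms.items()}
    return {name: m for name, m in memberships.items() if m > 0.0}

# A reading of 60 is supported by MEDIUM and HIGH, each with its own degree.
supported = fuzzify(60.0, SPEED_TERMS)
```

Each supported term carries its membership value along with it, which is the quantity the rules consume.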

A taxi driver who is approaching a red light must brake to come to a safe stop. The
control variable is the rate of braking, and the two inputs are the speed of the car and the distance
to the stop light. In this simple example there are only two input variables and two fuzzy rules
used to determine the braking rate. The two rules have the following form:

RULE 1:

IF  the  speed  of  the  car  is  HIGH, 

AND  the  distance  to  the  stop  light  is  NEAR, 

THEN the braking should be HARD.

RULE  2: 

IF  the  speed  of  the  car  is  MEDIUM, 

AND  the  distance  to  the  stop  light  is  MEDIUM, 

THEN  the  braking  should  be  MEDIUM. 

where the term set for speed is {SMALL, MEDIUM, HIGH}, the term set for the distance is
{NEAR, MEDIUM, FAR}, and the term set for the braking is {LIGHT, MEDIUM, HARD,
PANIC}. Both rules are illustrated in figure 2-3: the first rule is on top and the second rule on
the bottom. The terms are defined by triangular fuzzy sets that cover the three universes of
discourse, i.e., speed, distance, and braking rate. The important point is that the fuzzy rules are
approximations to an exact analytical description of the system. In this figure, the input speed x
has membership in both terms {MEDIUM, HIGH} of the linguistic variable speed. The degree
of membership is evaluated using these term sets. Similarly, the distance y to the red light has
membership in {NEAR, MEDIUM}. The minimum value of the memberships of these two
inputs is used to determine the "strength" of the conclusion, which is a term of the linguistic
variable called braking rate.


[Figure 2-3 panels: premise and conclusion term sets for RULE 1 and RULE 2 over SPEED X,
DISTANCE Y, and BRAKING Z; sensor inputs X and Y; rule aggregation; defuzzified result on the
BRAKING Z axis.]

Figure  2-3.  Typical  Output  Strength  Calculation  in  Fuzzy  Control  Logic 

The two braking rules yield different braking rate terms, which are aggregated to yield a
single output braking rate. The aggregation procedure is called "defuzzification" and is a simple
averaging of the areas in the output fuzzy sets. Again, the relative strength of the rule output is
determined by the minimum of the memberships of the inputs in each of the input premises. In
effect, the strength of the output for a rule is determined by the degree of satisfaction of the
premise clauses. Premise satisfaction or certainty and its propagation through the rules are an
inherent part of the solution technique. In fact, data are evidence only if they satisfy the premise
as measured by the certainty. Propagation of the certainty through the rule to determine the
certainty of the conclusion is the propagation of evidence through the fuzzy rule. The single
value obtained in the solution space by defuzzification is the control part of Bezdek's "... fuzzify,
solve, defuzzify, control."
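
The taxi example can be sketched end to end as follows. This is only a hedged illustration: the triangular breakpoints are invented, and the defuzzifier is assumed to be the centroid of the clipped, max-aggregated output sets, which is one common reading of "averaging of the areas" and not necessarily the exact procedure behind figure 2-3.

```python
def triangle(a, b, c):
    """Triangular membership function with feet at a and c and peak at b."""
    def mf(x):
        if x <= a or x >= c:
            return 0.0
        if x <= b:
            return (x - a) / (b - a)
        return (c - x) / (c - b)
    return mf

# Hypothetical term definitions (units and breakpoints are assumptions).
HIGH_SPEED = triangle(50.0, 80.0, 110.0)
MEDIUM_SPEED = triangle(10.0, 40.0, 70.0)
NEAR = triangle(-40.0, 0.0, 40.0)
MEDIUM_DISTANCE = triangle(20.0, 60.0, 100.0)
HARD = triangle(50.0, 75.0, 100.0)
MEDIUM_BRAKE = triangle(25.0, 50.0, 75.0)

def braking_rate(speed, distance, steps=1000):
    """Return (rule 1 strength, rule 2 strength, defuzzified braking rate)."""
    # Rule strength is the minimum membership over the premise clauses.
    w1 = min(HIGH_SPEED(speed), NEAR(distance))
    w2 = min(MEDIUM_SPEED(speed), MEDIUM_DISTANCE(distance))
    num = 0.0
    den = 0.0
    for i in range(steps + 1):
        z = 100.0 * i / steps
        # Clip each conclusion at its rule strength, then aggregate with max.
        mu = max(min(w1, HARD(z)), min(w2, MEDIUM_BRAKE(z)))
        num += z * mu
        den += mu
    # Centroid of the aggregated set; None when no rule fires at all.
    return w1, w2, (num / den if den > 0.0 else None)

w1, w2, z_star = braking_rate(speed=60.0, distance=30.0)
```

With these invented breakpoints both rules fire with equal strength, so the defuzzified value falls midway between the HARD and MEDIUM braking peaks.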

Note that in this example, the certainty of the premise, or its validity or degree of
satisfaction, is a single-valued real number. The strength of the premise is equivalent to the
notion of the MF value or validity. However, certainty can be represented in other forms besides
a single-valued real number between zero and one, such as interval valued and functional
valued. In this report, the interval-valued version of the certainty is used, which is more general
than the single-valued certainty illustrated in the control example. However, Bezdek's model still
holds. The main change is in the certainty representation, which will be discussed in detail in
section 3.


2.3 SOFT DECISIONS VERSUS HARD DECISIONS

In section 1, it was pointed out that a hard decision produces a single cause or conclusion
and a soft decision produces either a list of possible conclusions or a list of conclusions with a
relative ranking. The Dempster-Shafer theory of evidential reasoning is an example of soft decision.
The support for the soft decision is on the set of all subsets of the possible conclusions. So if the
possible conclusions are {h0, pp, bf, h2, h1}, then decisions of the form "bf-or-h2" are
allowable decisions. SPR produces a hard decision since one is essentially doing an N-class test
of hypothesis where the decision is one of N disjoint conclusions. SPR produces a single
conclusion after the test is made. FPR can do the same thing if the decision process is based on
a Bayes classifier as described in reference 7. Here, fuzzy clustering is used to replace the
standard statistical estimation of centroids and dispersion, and then these estimates are used in a
maximum a posteriori decision rule. The answer is still "hard," i.e., only one decision value.

Assume, as discussed in section 1, that one has a vector $\mu(x) = [\mu_1, \ldots, \mu_c]$. A "hard"
decision is represented as a unit vector where only one component has a value of one and all
others have a value of zero. A data vector is represented by a boldfaced letter, e.g., $\mathbf{x}_k$.

With unsupervised clustering, one learns about the structure of the data without the aid of
labeled data or, as is said in some cases, without a teacher. As the data are clustered, each data
point $\mathbf{x}_k$ has associated with it a fit vector
$\mu(\mathbf{x}_k) = [\mu_1(\mathbf{x}_k), \ldots, \mu_c(\mathbf{x}_k)]$, representing the
memberships of $\mathbf{x}_k$ in each of the c classes. This vector can be thought of as a fuzzy unit vector
or fit vector, as called by Kosko (reference 8), or as a fuzzy set represented by

$$\mu(\mathbf{x}_k) = \mu_1(\mathbf{x}_k)/1 + \cdots + \mu_c(\mathbf{x}_k)/c,$$

which is the standard fuzzy set notation. The only requirement placed on the components of this
vector is that they sum to 1, i.e.,

$$\sum_{i=1}^{c} \mu_i(\mathbf{x}_k) = 1.$$

As shown in figure 2-4, this means that for three dimensions, the vector is constrained to lie in
the plane illustrated by the shaded part. For hard classifications, the constraint is even more
severe; the membership vector or fit vector is required to be one of the unit vectors, also
illustrated in figure 2-4. An application is the fuzzy k-nearest neighbor algorithm. One finds the
k-nearest neighbors and then averages their membership vectors. The membership vector
indicates to what degree the sample vector belongs to each class. If one must make a hard
decision, then the class with the maximum component will provide the hard decision. Ties can
be broken randomly or arbitrarily.
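
A minimal sketch of the averaging form of the fuzzy k-nearest neighbor rule described above is given here. The function names, the Euclidean distance, and the sample data are assumptions for illustration; ties in the hard decision are broken arbitrarily by taking the first maximal class.

```python
import math

def fuzzy_knn(sample, labeled, k):
    """Average the membership vectors of the k nearest labeled points.

    labeled is a list of (point, membership_vector) pairs, where each
    membership vector sums to 1 over the c classes.
    """
    nearest = sorted(labeled, key=lambda item: math.dist(sample, item[0]))[:k]
    c = len(nearest[0][1])
    return [sum(m[i] for _, m in nearest) / len(nearest) for i in range(c)]

def harden(memberships):
    """Hard decision: index of the largest component (first one wins a tie)."""
    return max(range(len(memberships)), key=lambda i: memberships[i])

# Hypothetical labeled data for a three-class problem.
labeled = [
    ((0.0, 0.0), [1.0, 0.0, 0.0]),
    ((1.0, 0.0), [0.5, 0.5, 0.0]),
    ((0.0, 1.0), [0.6, 0.2, 0.2]),
    ((5.0, 5.0), [0.0, 0.0, 1.0]),
]
soft = fuzzy_knn((0.2, 0.2), labeled, k=3)
hard = harden(soft)
```

Because every neighbor's vector sums to 1, the averaged vector does too, so the soft result stays on the constraint plane while the hard result is one of its corners.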


Figure  2-4.  Hard  Decisions  vs  Soft  Decisions 


The type of classification or pattern recognition occurring in model assessment is even
more general than the fuzzy nearest neighbor. Associated with each class is a closed interval,
which represents the lower and upper bounds on the membership in that class. So, the
membership vector has now been replaced with what is called an interval-valued fuzzy set. Klir
defines the interval-valued fuzzy sets as "... membership functions of the form
$\mu_A : X \to \mathcal{P}([0,1])$, where $\mu_A(x)$ is a closed interval in [0,1] for each
$x \in X$" (reference 4, p. 14). Here X, the
universe of discourse of the fuzzy set, is the set of possible classes, and the interval associated
with each class is the closed interval representing the lower and upper bound associated with the
certainty of membership of each class. This situation is illustrated in figure 2-5 for a five-class
problem. The fuzzy set is given by


where instead of single-valued membership functions of figure 1-3a, one has interval-valued
membership functions as illustrated in figure 1-4. In this fuzzy set, the integers {1,...,c}, c = 5,
are the decisions and the universe of discourse. The support for each decision is a set represented
by the indicator function, which is 1 if $x \in A$ and 0 elsewhere. For the interval-valued
fuzzy sets, the set A has the special form $A = [a_1, a_2]$, where $a_1$ is the lower bound of the
support and $a_2$ is the upper bound of the support. As in FPR, there is no clear-cut winner since
each of the possible conclusions can have support. The process of aggregating this support will
be discussed later.

Figure 2-5. Interval-Valued Fuzzy Sets


2.4  CLUSTER  INTERPRETATION  OF  RULES 

Supervised clustering occurs when data samples that lead to a known decision are
clustered. In effect, one is approximating the decision function by determining the pre-image of
a known conclusion. The decision function then can be specified by
$S(\mathbf{x}_k) = i \Leftrightarrow \mathbf{x}_k \in X_i$, where
$X_i$ is the i-th cluster. Essentially, these clusters are fuzzy rules mapping portions of the input
space to integers in the output space. The integers are the classes that are referred to as models,
such as bf, h2, pp, h0, and h1, as well as the index into the vector of classes. Kosko points out that
"The key idea is cluster equals rule" (reference 8, p. 330). Although Kosko was referring to a
very specific case of a binary input/output fuzzy associative matrix (BIOFAM), the general idea
holds here as well.

Figure 2-6 illustrates the formation of clusters or rules for model assessment. In this
figure, the feature extraction from the data does the same thing that a Hough transform would,
i.e., maps lines to points. However, here a line is represented by a two-dimensional point
(intercept, slope). The slope is called the drift and the intercept is called the jump because of its
historical reference to the bearing tracks. The feature space in figure 2-6 is a four-dimensional
space, and the cluster is mapped to the decision called h2. The point is that this cluster
represents a region of the feature space that maps to an integer in the decision space. Intuitively,
these mappings are the fuzzy rules.

In this report, features are extracted from the data to drive the decision system, and the
features are then modeled as fuzzy sets. Using the variance of each feature, a normal density is
fitted to the feature. When renormalized, this density is interpreted as a fuzzy number that is
treated as input data. It is this interpretation that allows the decision process to be modeled
totally in terms of fuzzy constructs. In fact, the entire system is a fuzzy expert system where the
data objects are fuzzy sets and the inference engine uses fuzzy logic. The fuzzy rules are viewed
as composite mappings from the input space to the decision space. The steps are summarized as
follows:

Figure 2-6. Formation of Clusters in Model Assessment


1.  The  data  are  fitted  using  linear  regression  techniques. 

2. The features are extracted from the linear fit and their densities are renormalized to
construct fuzzy numbers. These numbers now form the input data to the fuzzy system whose
general structure was illustrated in figure 2-2.

3. The clauses of the fuzzy rules and the fuzzy partition of the feature space are then
used to construct closed certainty intervals associated with each clause. These intervals are, in
turn, combined to form a premise certainty interval.

4.  The  premise  certainty  interval  is  mapped  to  the  output  decision  space  using  the 
strength  of  the  rule  itself. 
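
Steps 1 and 2 above can be sketched roughly as follows. This is an assumption-laden illustration and not the report's implementation: it fits a single line by ordinary least squares, takes the textbook variance estimates for the intercept (jump) and slope (drift), and rescales a normal density so that its peak is 1. The function names and the sample data are hypothetical, and a perfect fit (zero variance) is not handled.

```python
import math

def fit_line(ts, ys):
    """Least-squares fit y = jump + drift * t.

    Returns ((jump, var_jump), (drift, var_drift)) using the usual
    ordinary-least-squares variance estimates.
    """
    n = len(ts)
    t_bar = sum(ts) / n
    y_bar = sum(ys) / n
    sxx = sum((t - t_bar) ** 2 for t in ts)
    drift = sum((t - t_bar) * (y - y_bar) for t, y in zip(ts, ys)) / sxx
    jump = y_bar - drift * t_bar
    resid = sum((y - jump - drift * t) ** 2 for t, y in zip(ts, ys))
    sigma2 = resid / (n - 2)  # residual variance with two fitted parameters
    var_drift = sigma2 / sxx
    var_jump = sigma2 * (1.0 / n + t_bar ** 2 / sxx)
    return (jump, var_jump), (drift, var_drift)

def fuzzy_number(mean, variance):
    """Normal density rescaled to a peak of 1: a convex, normalized fuzzy set."""
    def mf(x):
        return math.exp(-0.5 * (x - mean) ** 2 / variance)
    return mf

# Hypothetical residual samples.
ts = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.0]
(jump, var_jump), (drift, var_drift) = fit_line(ts, ys)
jump_mf = fuzzy_number(jump, var_jump)
drift_mf = fuzzy_number(drift, var_drift)
```

The two membership functions then play the role of the fuzzy input data that steps 3 and 4 compare against the rule clauses.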


Figure 2-7 illustrates this mapping procedure. Note that the clusters illustrated in figure
2-7 are two-dimensional, but this is not how they are presently implemented. Instead, they are
implemented as two one-dimensional clusters. This figure is for conceptual purposes only. As
mentioned in section 2.2, the decisions are described in terms of interval-valued fuzzy sets. So
conceptually, the fuzzy rule is a mapping from regions in the input space to the output space, but
in this case, because of the form of the input data and the output decision, the mapping is harder
to visualize. But conceptually, Kosko's claim remains, "... cluster equals rule."

Figure 2-7. Fuzzy Rule Maps to Interval-Valued Fuzzy Set


3. FUZZY RULES IN CLASSIFICATION


3.1  FUZZY  RULES 

In section 2, fuzzy rules were described as mappings from a cluster or region of the
feature space into the decision space. Propagation of certainty through fuzzy rules was also
discussed and illustrated in a simple control example. The main components of Bezdek's
description of a fuzzy system were illustrated in this example, namely, "... fuzzify, solve,
defuzzify and control." In the taxi driver example of figure 2-3, the certainty was represented as
a real number in the interval [0,1]. In this section, the model is extended to include a more
sophisticated method of certainty representation: an interval-valued representation. For the
remainder of this report, only the interval-valued certainty representation is discussed.
Intuitively, this interval can be thought of as lower and upper bounds on the satisfaction of an
event, premise, conclusion, or fact.


3.1.1  Interval-Valued  Certainties 

First consider how interval-valued certainty arises from the interaction of the fuzzy data
and the fuzzy rule. When the data are themselves uncertain, the input is modeled as a fuzzy number,
which is a special type of fuzzy set. A fuzzy set is represented by an MF, which is a
generalization of the characteristic function used to express sets in real analysis. So it is a
mapping from the domain of definition X to the interval [0,1], denoted $\mu_A : X \to [0,1]$. A
fuzzy number is a convex normalized fuzzy set (reference 4, p. 17), normalized meaning the MF
has a maximum of 1, or $\sup_{x \in X} \mu_A(x) = 1$. Intuitively, the convexity of a fuzzy set implies that it
looks like a bell-shaped or single-mode curve. Formally, convexity of fuzzy sets is defined in
terms of $\alpha$-cuts of the MF, which are crisp sets representing the region of X with membership
greater than or equal to $\alpha$, i.e., the crisp set defined by
$A_\alpha = \{x \mid x \in X,\ \mu_A(x) \ge \alpha\}$. A fuzzy set
A is convex if and only if all of its $\alpha$-cuts are convex. In figure 3-1, both the data and the
property are convex sets, and this makes the derivation of the certainty interval easier.


Figure 3-1. Fuzzy Data Fitted to a Fuzzy Premise

Certainty intervals are derived in CMM A from the fact that the input data are fuzzy
numbers, and the properties defining the clauses are represented as fuzzy sets. Again, in figure
3-2 one sees the property as $\mu_B(x)$ and the data as $\mu_A(x)$, except here only one term set is
shown. Conceptually, the upper bound of the certainty interval represents the amount of overlap
of the fuzzy data A and the fuzzy property B. In possibility theory, the standard intersection of
two fuzzy sets is given by the minimum function, i.e.,

$$\mu_{A \cap B}(x) = \min[\mu_A(x), \mu_B(x)], \quad x \in X,$$

and the maximum of $\mu_{A \cap B}(x)$ represents the largest degree of overlap. This minimum of two
fuzzy sets is the counterpart to the logical AND. The standard union of two fuzzy sets is given
by

$$\mu_{A \cup B}(x) = \max[\mu_A(x), \mu_B(x)], \quad x \in X.$$

The union is the counterpart of the logical OR and complementation is the counterpart of the
logical NOT. Complementation is represented as $\mu_{\bar{A}}(x) = 1 - \mu_A(x)$. Both of these concepts
are needed to construct the possibility and the necessity. The lower bound of the certainty
interval is represented by the measure of subsethood of fuzzy set A in fuzzy set B, or the degree
that A is a subset of B. Mathematically, the upper bound is defined as the possibility:

$$\Pi = \sup_{x \in X} \min[\mu_A(x), \mu_B(x)].$$

The lower bound is defined as the necessity:

$$N = \inf_{x \in X} \max[1 - \mu_A(x), \mu_B(x)].$$

The  graphical  construction  for  both  of  these  measures  is  illustrated  in  figure  3-2. 
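
A small numerical sketch of the two bounds is given below. It approximates the sup and inf by a maximum and minimum over a sampling grid, and the triangular shapes chosen for the data A and the property B are assumptions made only so the numbers are easy to check by hand.

```python
def triangle(a, b, c):
    """Triangular membership function with feet at a and c and peak at b."""
    def mf(x):
        if x <= a or x >= c:
            return 0.0
        if x <= b:
            return (x - a) / (b - a)
        return (c - x) / (c - b)
    return mf

def possibility(mu_a, mu_b, grid):
    """Upper bound: largest overlap of data A with property B on the grid."""
    return max(min(mu_a(x), mu_b(x)) for x in grid)

def necessity(mu_a, mu_b, grid):
    """Lower bound: degree to which data A is contained in property B."""
    return min(max(1.0 - mu_a(x), mu_b(x)) for x in grid)

grid = [i / 100.0 for i in range(0, 1001)]  # samples of X on [0, 10]
data = triangle(3.0, 4.0, 5.0)              # fuzzy number A (hypothetical)
prop = triangle(2.0, 5.0, 8.0)              # fuzzy property B (hypothetical)

certainty = (necessity(data, prop, grid), possibility(data, prop, grid))
```

For these shapes the curves cross at x = 3.5 and x = 4.25, which gives a certainty interval of about [0.5, 0.75]; a finer or coarser grid changes the result only when the crossing points fall between samples.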


[Figure 3-2 panels: POSSIBILITY = $\sup_x \min[\mu_A(x), \mu_B(x)]$; NECESSITY = $\inf_x \max[1 - \mu_A(x), \mu_B(x)]$.]


Figure  3-2.  Calculation  of  the  Necessity  and  Possibility  of  A  is  B 

In this example, the data are used directly to calculate the bounds. In this model
assessment system, the fuzzy data are first fitted by a piecewise continuous approximation and
then the necessity and the possibility are constructed. This approximation increases the speed of
the system by reducing the time complexity of the problem. For example, in figure 3-3a the
fuzzy input is first approximated with a trapezoid, and then the possibility calculation is based on
the trapezoid. The necessity is illustrated in figure 3-3b. The certainty interval that results is
given by $[N, \Pi]$.

Figure 3-3a. Calculation of the Possibility from the Data for the Moderate Term


Figure  3-3b.  Calculation  of  the  Necessity  from  the  Data  for  the  Moderate  Term 

Note that when the data are singletons, $N = \Pi = \mu_B(x)$ and the interval reduces to
the single-valued representation used in the control problem example of the taxi driver. In fact,
the control problem can be considered a special case of the decision problem where there is only
one class; instead of estimating the certainty that each class has occurred, one uses the fuzzy
rules to estimate the control parameter itself. The next section discusses the propagation of
evidence through fuzzy rules when the certainty representation is interval valued.


3.1.2  Propagation  of  Evidence  Through  Fuzzy  Rules 

In the taxi driver control example of figure 2-3, the propagation of evidence through the
fuzzy rules was described as using the minimum of the MF values for the two inputs. More
specifically, the two input variables, speed and distance, were the arguments to term sets known
as HIGH and NEAR. The term sets were triangular fuzzy sets and, when evaluated at these
inputs, the minimum membership value was used to truncate the output fuzzy set for the braking
control called HARD. See the first rule in figure 2-3. In particular, the output is the rate of
braking and is given by
$\mu_{BRAKE}(z) = \min(\mu_{HARD}(z), \min[\mu_{HIGH}(x), \mu_{NEAR}(y)])$. The certainty of
the output can be thought of as $\min[\mu_{HIGH}(x), \mu_{NEAR}(y)]$. This concept is generalized for the
interval-valued certainty measures.

In this report, the fuzzy model assessment uses interval-valued measures of certainty, and
the propagation of certainty is based upon t-norms, s-norms, and the detachment operator. These
operations are generalizations of the logical operations of conjunction, disjunction, and
implication, respectively, and have the same look and feel as the operations used to propagate
evidence in the control problem. The phrases "propagation of certainty" and "propagation of
evidence" are used interchangeably. In section 3.1.1, the certainty of the premise was derived using
the necessity and the possibility. When the premise has multiple clauses, the clause certainties
are aggregated to yield the premise certainty. Intuitively, aggregation means the combination of
certainties, and the mathematical definitions are deferred until section 3.3. In this report,
conjunctions are allowed in the premise clauses, but premise disjunctions must be expanded into
distinct fuzzy rules. For a more complete description of this approach, Bonissone's papers
(references 9-12) are recommended to the reader. For a complete discussion of the
representations of certainty, refer to reference 13.

To aggregate the premise certainty, the t-norm is used. Since the triangular norm or
t-norm behaves like a logical conjunction, the strength of the premise is based on the weakest
clause. The t-norm is a binary operator denoted by $T(x,y)$, where $x, y \in [0,1]$, and the minimum
function is a well-known example. The t-norm is associative, $T(x,T(y,z)) = T(T(x,y),z)$,
commutative, $T(x,y) = T(y,x)$, and monotonically nondecreasing in both arguments,
$T(x,y) \le T(v,w)$ if $x \le v$ and $y \le w$. The t-norm satisfies the boundary conditions $T(0,0) = 0$
and $T(x,1) = T(1,x) = x$, which also satisfy the definition of the logical conjunction. To apply
the t-norm to the interval-valued certainty, assume that the premise is of the form
$\bigwedge_{i=1}^{p} A_i$, where $A_i$
is a fuzzy clause and $[a_i, A_i]$ are its interval-valued certainties. Then the premise interval
certainty is given by

$$[b, B] = [T(a_1, \ldots, a_p),\ T(A_1, \ldots, A_p)].$$

For the special case when $T(x,y) = \min(x,y)$, the premise certainty is given by the minimum of
the certainty minimums and the minimum of the certainty maximums. In general, the t-norm can
be designed to reflect the data association or correlation of the premise clauses. For positively
correlated clauses, the norm $T(x,y) = \min(x,y)$ is a good choice; for uncorrelated clauses the
norm $T(x,y) = xy$ may be a good choice; and for negatively correlated clauses, the bold
intersection $T(x,y) = \max(0, x + y - 1)$ may best capture the association (reference 10).

Although the work so far has used only $T(x,y) = \min(x,y)$, the mechanism to select the
appropriate t-norm for each rule is clearly in place.
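
The aggregation of clause intervals into a premise interval can be sketched as below for the three t-norms named above. The function names are hypothetical, and multi-argument aggregation is assumed to be a left fold of the binary t-norm, which associativity permits.

```python
from functools import reduce

def t_min(x, y):
    """Minimum t-norm (positively correlated clauses)."""
    return min(x, y)

def t_product(x, y):
    """Product t-norm (uncorrelated clauses)."""
    return x * y

def t_bold(x, y):
    """Bold intersection (negatively correlated clauses)."""
    return max(0.0, x + y - 1.0)

def premise_interval(clauses, t_norm=t_min):
    """Combine clause intervals [(a_i, A_i), ...] into the premise interval (b, B)."""
    lowers = [lo for lo, _ in clauses]
    uppers = [hi for _, hi in clauses]
    # Fold the binary t-norm separately over lower and upper bounds.
    return reduce(t_norm, lowers), reduce(t_norm, uppers)

clauses = [(0.5, 0.75), (0.25, 1.0)]  # hypothetical clause certainties
by_min = premise_interval(clauses, t_min)
by_product = premise_interval(clauses, t_product)
by_bold = premise_interval(clauses, t_bold)
```

For the same clause certainties the lower bound shrinks from 0.25 (minimum) to 0.125 (product) to 0 (bold intersection), which shows how the choice of t-norm encodes the assumed association between clauses.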

The s-norm or t-conorm is a generalization of the logical disjunction or logical OR
operator. The s-norm $S(x,y)$ with $x, y \in [0,1]$ has similar properties to the t-norm. These
properties include associativity, $S(x,S(y,z)) = S(S(x,y),z)$, commutativity, $S(x,y) = S(y,x)$, and
monotonicity in both arguments, $S(x,y) \le S(v,w)$ if $x \le v$ and $y \le w$. The boundary conditions
resemble the logical OR in that $S(1,1) = 1$ and $S(x,0) = S(0,x) = x$. The s-norm used in this report is
the maximum function $S(x,y) = \max(x,y)$. The t-norm and s-norm are related by a generalized
version of DeMorgan's laws provided one uses a hard complement, i.e., $N(x) = 1 - x$.
DeMorgan's laws then become $S(x,y) = N(T(N(x),N(y)))$ and $T(x,y) = N(S(N(x),N(y)))$.
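
The DeMorgan relation can be checked numerically with a short sketch. The helper name `dual` is hypothetical; it builds the s-norm that corresponds to a given t-norm under the hard complement.

```python
def complement(x):
    """Hard complement N(x) = 1 - x."""
    return 1.0 - x

def dual(t_norm):
    """Build the s-norm S(x, y) = N(T(N(x), N(y))) from a t-norm T."""
    def s_norm(x, y):
        return complement(t_norm(complement(x), complement(y)))
    return s_norm

s_from_min = dual(min)                                   # expected: max(x, y)
s_from_product = dual(lambda x, y: x * y)                # expected: x + y - x*y
s_from_bold = dual(lambda x, y: max(0.0, x + y - 1.0))   # expected: min(1, x + y)

# Sample points chosen to be exactly representable in floating point.
points = [0.0, 0.25, 0.5, 0.75, 1.0]
```

Evaluating these on the sample points recovers the maximum, the probabilistic sum, and the bounded sum, which are the s-norms paired with the three t-norms discussed above.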

The detachment operator is the mathematical mechanism needed to propagate the premise
certainty through the implication operator to the conclusion. The detachment operator is defined
in terms of the s-norm and t-norm operators and provides the mechanism for the propagation of
evidence, not only for the interval-valued certainty representation, but also for the single-valued
certainty representation. In the taxi driver control problem, the output fuzzy set of figure 2-3 can
be written as $\mu_{BRAKE}(z) = \min(\mu_{HARD}(z), \min[\mu_{HIGH}(x), \mu_{NEAR}(y)])$. This output is derived from
the following rule: if the speed x is HIGH and the stop light is NEAR, then brake HARD. Yet
there is nothing in this conclusion that includes the confidence in the rule itself. In general, if
one has a rule $P \to Q$, then if $v(P)$ denotes the certainty or validity of the premise and
$v(P \to Q)$ denotes the strength or validity of the forward implication operator, then the
detachment operator must relate these two quantities to the conclusion validity $v(Q)$. More
precisely, the binary detachment operator $m(v(P), v(P \to Q))$ is defined so that m is as large as
possible and still satisfies the constraint $m(v(P), v(P \to Q)) \le v(Q)$ (reference 14, p. 167). That
is, m gives the strongest conclusion one can infer without overstating the truth of the conclusion. The
validity or certainty can also be thought of as the truth. If one assumes that
$v(P \to Q) = 1$, then the control example is also an application of a detachment operator. The
detachment operator m can be taken to be a t-norm (reference 14) and, in particular, the
minimum function $\min(v(P), v(P \to Q))$ is the single-valued certainty that is often used by
fuzzy expert system shells like fuzzy logic official production system (FLOPS) (reference 15).
Just as the detachment operator can propagate evidence through fuzzy rules for single-valued
certainty measures, the operator can also propagate evidence using interval-valued certainty
measures.

The detachment operator discussed above was a function of two arguments since the
validity of both the premise and the rule were real numbers in the interval [0,1]. Now the
premise certainty is given by the interval $[b, B]$, and the conclusion certainty is represented by an
interval as well. Moreover, it is not clear how to represent the certainty of the implication in this
description. Bonissone (reference 10) uses the "necessity" and "sufficiency" of the implication
operator to construct the certainty of the conclusion. The necessity is given by $n = v(Q \to P)$
and the sufficiency is given by $s = v(P \to Q)$. (Note this is not the same necessity used to
measure the subsethood of one fuzzy set in another; instead, it is the classical mathematical
definition of the necessary part of the if-and-only-if logical implication.) The sufficiency s is the
strength of the implication in the forward direction and the necessity n is the strength of the
implication in the backward direction. So the detachment operator is a vector-valued function on
a vector field, $m([b,B],[s,n]) = [T(s,b), S(B, 1-n)]$, where T is a t-norm and S is its corresponding
s-norm. For example, if $T = \min$, then $S = \max$ and $m([b,B],[s,n]) = [\min(s,b), \max(B, 1-n)]$,
which is the detachment operator used in the examples of section 4.

One can get a feel for this operator by considering the following four special
cases, where the strengths of the forward and backward implications are set at their
limiting values:

1. [s,n] = [0,0] => m = [0,1]. Here one cannot rely on the rule in either direction, so
the inference mechanism has no strength. The certainty interval is appropriately given by [0,1]
since the lower bound on the certainty is zero and the upper bound is 1. This simply says that
nothing is known about the certainty of the conclusion.

2. [s,n] = [1,1] => m = [b,B]. Here the implication operator is totally reliable in both
the forward and the reverse direction. The strength of the conclusion is then bounded above and
below by the corresponding bounds of the premise, simply because the implication operator has
infinite fidelity.

3. [s,n] = [1,0] => m = [b,1]. Here the implication operator in the forward direction is
absolutely reliable but the reverse implication operator is absolutely unreliable. The conclusion
certainty interval is lower bounded by the lower bound of the premise validity, but the upper
bound of the conclusion validity is 1 since no reliability can be placed on the reverse implication
operator. Without the reverse implication, nothing can be inferred about the upper bound except
that it is 1.

4. [s,n] = [0,1] => m = [0,B]. Here the forward implication operator is useless, and so
no lower bound can be established on the conclusion validity. However, the strength of the
reverse implication operator allows an upper bound on the conclusion validity. One expects this
in the case of modus tollens or backward chaining.

Note that the premise certainty interval in conjunction with the forward ply determines the
strength of the lower bound, and in conjunction with the reverse ply determines the strength of
the upper bound.
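
The four limiting cases can be checked with a few lines of Common LISP. What follows is a minimal sketch and not the code of the exploratory program: the function name detach and its keyword arguments are hypothetical, T = min and S = max are assumed as in the example above, and the premise interval [3/10, 4/5] is an arbitrary illustration written with exact rationals so that no rounding appears.

```lisp
;; Illustrative sketch only: DETACH and its keyword arguments are hypothetical.
;; PREMISE is the certainty interval (b B); PLY is (s n), the forward
;; (sufficiency) and backward (necessity) strengths of the implication.
(defun detach (premise ply &key (tnorm #'min) (snorm #'max))
  "Return the conclusion interval (T(s,b) S(B,1-n))."
  (destructuring-bind (b big-b) premise
    (destructuring-bind (s n) ply
      (list (funcall tnorm s b)
            (funcall snorm big-b (- 1 n))))))

;; The four limiting cases for the premise interval [3/10, 4/5]:
(detach '(3/10 4/5) '(0 0))   ; case 1: (0 1), nothing is known
(detach '(3/10 4/5) '(1 1))   ; case 2: (3/10 4/5), premise bounds carried over
(detach '(3/10 4/5) '(1 0))   ; case 3: (3/10 1), only a lower bound
(detach '(3/10 4/5) '(0 1))   ; case 4: (0 4/5), only an upper bound
```

Passing another t-norm and its dual s-norm through the keyword arguments changes the operator without touching the calling code.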

However, the interval interpretation is not the only one that can be placed upon the
conclusion validity. The truth of the conclusion can be thought of as a linguistic variable. Figure
3-4 shows the definition of the linguistic variable TRUTH. In this figure, one sees that the
semantic interpretation of the linguistic term "true" appears as a ramp function and "absolutely
true" as a Kronecker delta function located at a truth value of 1. Note that the term "undecided"
is a constant function shaped like a uniform distribution function, much like case 1 above.
In fact, the certainty intervals can be thought of as fuzzy sets (as illustrated in figure
1-4b) and thus seen to be related to the terms in the linguistic variable TRUTH (reference 5).
Moreover, redefining TRUTH as in figure 3-5, the certainty intervals can be interpreted as
approximations to the terms of TRUTH (reference 13). Thus the interval-valued certainties
approximate the functional-valued linguistic terms of the linguistic variable TRUTH.


Figure 3-4. Linguistic Variable TRUTH


3.2 DECISION SPACE

In section 2, it was pointed out that the decision function of classical pattern recognition
mapped the data to singletons or single classes in the decision space. FPR generalizes classical
pattern recognition by mapping to a fuzzy unit vector (fit vector), where the elements of the
vector are associated with classes and the values of the elements represent the membership in
those classes. It was also pointed out that fuzzy model assessment generalizes the certainty
representation so that the elements in the fit vector are now interval-valued sets and the fit vector
becomes an interval-valued fuzzy set. Section 3.1 established that one could view these
interval-valued fuzzy sets as second-order fuzzy sets whose members are fuzzy sets approximating
the terms of the linguistic variable called TRUTH. That is, think of these intervals as
approximations to the terms of the linguistic variable TRUTH. One further extension is needed.
Instead of each element of the fit vector being associated with singletons in the decision space,
the dimension of this vector is expanded to include one element for each member of the power
set of the classes, just as in the Dempster-Shafer formulation. That is, the elements of the fit
vector are members of the power set of {h0, bf, pp, h2, nf}, denoted P{h0, bf, pp, h2, nf} = P. In
practice, fuzzy model assessment maps to a subset of P. So a fuzzy rule can have a conclusion
{bf, h2}, which is interpreted as either a change in base frequency or a contact maneuver has
occurred. Associated with this decision or element of the fit vector is a fuzzy set representing the
certainty interval of the conclusion.

The fact that the elements of the decision space are now members of P is a result of the
fuzzy rule base. The fuzzy rules were fashioned after the compatibility maps of CMMA. The
Dempster-Shafer approach used in CMMA maps the evidence or data to the frame of
discernment whose initial set is the set of all possible decisions; this means that the basic
probability assignment (bpa) is on P. The evidence or sensor readings are mapped to the
decision frame via compatibility maps. Since different sets of sensors can map to the decision
frame, the evidence is combined from these different sources using the Dempster-Shafer rule of
combination (ROC). Compatibility maps are very convenient in that they only ask how the
current inputs can support the elements in the final decision frame of discernment. Then the
support for each conclusion is aggregated using the ROC, eliminating the question of how the
different sensors affect each other because this is all taken care of in the ROC. However, the
ROC assumes that the sources of information or data sources are "independent," although it is
not clear what independence means. Fuzzy rules also provide support for the conclusions from
the data, but the support representation and the propagation mechanism are quite different.

The compatibility maps of CMMA map the bpa of the sensor frame to the bpa of the
decision frame. Compatibility maps are based on compatibility relations, which are defined on
the product space of the two frames. "A compatibility relation simply describes which elements
from the two frames can be true simultaneously" (reference 16). A compatibility map is defined
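as

$$C_{A \to B}(A_k) = \{\, b_j \in \Theta_B : (a_i, b_j) \in \Theta_{A,B} \text{ for some } a_i \in A_k \,\},$$

where $\Theta_A = \{a_1, a_2, \ldots, a_n\}$ and $\Theta_B = \{b_1, b_2, \ldots, b_m\}$ are the two frames and
$\Theta_{A,B} \subseteq \Theta_A \times \Theta_B$ is the compatibility relation, so support is mapped from sets in $\Theta_A$ to
sets in $\Theta_B$. The support is translated from frame to frame via the summation of the bpa that
maps to a specific set, i.e.,

$$m_B(B_j) = \sum_{C_{A \to B}(A_i) = B_j} m_A(A_i),$$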

where $m_B(B_j)$ is the bpa assigned to set $B_j$ in frame B. From the formula for the bpa in frame
B, it is clear that support is mapped from frame A in an additive manner. This is not the case
with fuzzy rules. Fuzzy rules map support to the same sets in the decision frame, but in a
different manner and in a different form. The decision space is the power set P of the decision
frame, and the compatibility maps are the source of the multivaluedness of the mappings.
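
The additive translation of support can be made concrete with a small Common LISP sketch. It assumes the usual Dempster-Shafer construction, in which a focal set of frame A maps to the set of all frame-B elements compatible with at least one of its members; it is not taken from the CMMA code. The function names, the sensor-frame elements s1, s2, s3, and the masses are all hypothetical, while h2, bf, and pp are the hypotheses named in this report.

```lisp
;; Illustrative sketch only: all function names and the sensor-frame
;; elements S1, S2, S3 are hypothetical.  A compatibility relation is a list
;; of (a . b) pairs; a bpa is a list of (focal-set mass) entries.
(defun canonical (elements)
  "Return ELEMENTS without duplicates, sorted by name, so sets compare with EQUAL."
  (sort (remove-duplicates (copy-list elements)) #'string<))

(defun compat-image (focal-set relation)
  "Set of frame-B elements compatible with at least one member of FOCAL-SET."
  (canonical (loop for (a . b) in relation
                   when (member a focal-set) collect b)))

(defun translate-bpa (bpa relation)
  "Sum the mass of every frame-A focal set that maps to the same frame-B set."
  (let ((result '()))
    (loop for (focal-set mass) in bpa
          do (let* ((image (compat-image focal-set relation))
                    (entry (assoc image result :test #'equal)))
               (if entry
                   (incf (second entry) mass)          ; additive accumulation
                   (push (list image mass) result))))
    (nreverse result)))

(defparameter *relation* '((s1 . h2) (s1 . bf) (s2 . pp) (s3 . h2)))
(defparameter *bpa-a* '(((s1) 1/2) ((s3) 1/4) ((s1 s3) 1/8) ((s1 s2 s3) 1/8)))

(translate-bpa *bpa-a* *relation*)
;; gives (((BF H2) 5/8) ((H2) 1/4) ((BF H2 PP) 1/8))
```

The focal sets (s1) and (s1 s3) both map to {bf, h2}, so their masses add to 5/8; this summation is the additive behavior that the fuzzy rules do not share.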

The fuzzy rules reflect the multivaluedness of the compatibility maps since they were
fashioned from them. On a more basic level, the multivaluedness is due to the fact that
single-sensor, single-measurement rules are not specific enough to reduce the ambiguity of the
compatibility map or the fuzzy rule. So ambiguity is not a flaw, but a natural result of the partial
observation of the data. When more specific rules are designed, the premise contains the
conjunction of many measurement conditions and produces more specific decisions. Unlike
binary rule-based systems, all the rules fire for each data input, but the strength of the premise
satisfaction and the strength of the rule determine the strength of the conclusion. The next step
is to aggregate the conclusion support from all rules, which is the counterpart to the
Dempster-Shafer ROC in CMMA evidential reasoning. Aggregation is discussed in the next section.


3.3 AGGREGATION OF CERTAINTY FOR CONCLUSIONS


Conclusion certainty is combined using general aggregation operators, which behave as
averaging operations (reference 4, section 2.6). Aggregation operators combine fuzzy sets in a
specific manner that can be defined by a function with the appropriate properties. This function
is defined as $h : [0,1]^n \to [0,1]$ and, when applied to fuzzy sets $A_1, \ldots, A_n$ defined on a common
universe of discourse, has the form $\mu_A(x) = h(\mu_{A_1}(x), \ldots, \mu_{A_n}(x))$. The function h is assumed to be
continuous, symmetric, and monotonic nondecreasing in all its arguments. Moreover, two
boundary conditions must be satisfied, namely h(0,0,...,0) = 0 and h(1,1,...,1) = 1, which
reflect the averaging nature of the operator. Many operators possess these properties, including
the maximum and the minimum operators representing the union and intersection of the fuzzy
sets in our system.

Aggregation operators are used in two different ways. The first way is to aggregate the
certainty obtained from one set of data across the rules. The rules are not independent, i.e.,
different rules can map to the same conclusion and the data and the rules can in some sense be
correlated or associated. Therefore, there may be many fuzzy certainty intervals associated with
one conclusion of P originating from a single data set. These certainties need to be combined or
aggregated to form one certainty interval. The second way to aggregate is across data samples.
With several sets of samples, evidence will accumulate at the conclusions in the form of these
interval-valued sets, and this evidence has to be combined or aggregated across the samples or
across time. The second aggregation method does not have to be the same as the first. So far,
this latter aggregation method has not been tested simply because the work has not progressed far
enough. The first method has been tried, and the following simple aggregation procedure has been
tested.

For each element of the decision space, there will be a collection of intervals generated by
all the rule firings. Bonissone's method of conclusion aggregation is to replace these intervals
$[c_i, C_i]$ with one interval,

$$[c, C] = [S(c_1, c_2, \ldots, c_n),\; S(C_1, C_2, \ldots, C_n)]$$ (reference 10).

When the s-norm operator is chosen as S(x,y) = max(x,y), this yields a certainty interval from
the max-of-the-minimums to the max-of-the-maximums for the conclusion interval. The support
of the conclusion then is the maximum support that any one rule gives to the conclusion;
likewise, the possibility is the maximum of all the possibilities. This method of conclusion
aggregation  always  yields  a  nonempty  interval  provided  some  rule  has  yielded  a  nonempty 
certainty  interval.  More  formally,  if  the  individual  intervals  are  given  by  /x,(x)  = 
the  aggregation  operator  is  given  by 

m  m 

=  suP/^zCv)  A  Hfrix),  where  //^(x)  =  Q/i.^x)  and  fX^ix)  =  W- 

M  i'k] 

The union is implemented using the maximum operator and the intersection is implemented using
the minimum operator. From this definition, the aggregation function h is a continuous,
symmetric, and nondecreasing function of all its arguments. The boundary conditions are also
satisfied. The simplicity of this method makes it practical, requiring only minimum and
maximum operations. Note further that this aggregation is conservative in that each lower
bound $c_i$ is itself the minimum of the certainties of the premise clauses and the certainty of the
forward ply. So, the aggregation procedure provides a conservative necessity measure. Observe
that the support that is mapped into the decision space by the fuzzy rules is not additive, as it is
with the Dempster-Shafer approach.

Other alternatives to this aggregation procedure exist. Source consensus (reference 10) is
one more aggregation method. Here the form of the aggregation is

$$[c, C] = [S(c_1, c_2, \ldots, c_n),\; T(C_1, C_2, \ldots, C_n)],$$

which, for the choice of T(x,y) = min(x,y), would yield the max-of-the-minimums and the
min-of-the-maximums. This interval would be considerably shorter, and in fact could be null even
when each component interval is not null. However, when all the intervals supporting a
conclusion overlap, this would give a smaller interval. The necessity would be conservative, but
the possibility may not. This shortened interval length tends to indicate too much knowledge of
the certainty of the conclusion. Conflicting answers can yield an empty set, leaving no idea of
what type of support exists for the conclusions other than that they may be conflicting. Bonissone
points out that this is a test for conflicting evidence.
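
The two aggregation forms are easy to compare side by side. The following Common LISP sketch assumes S = max and T = min and represents each certainty interval as a two-element list; the function names and the sample intervals are hypothetical and are not taken from the exploratory program.

```lisp
;; Illustrative sketch only: function names and sample data are hypothetical.
;; Each certainty interval is a list (c C) with c <= C.
(defun aggregate-max (intervals)
  "Aggregation with S = max: (max of lower bounds, max of upper bounds)."
  (list (reduce #'max intervals :key #'first)
        (reduce #'max intervals :key #'second)))

(defun source-consensus (intervals)
  "Consensus with S = max and T = min; returns NIL when the evidence conflicts."
  (let ((lower (reduce #'max intervals :key #'first))
        (upper (reduce #'min intervals :key #'second)))
    (when (<= lower upper)
      (list lower upper))))

(defparameter *overlapping* '((1/5 3/5) (2/5 9/10) (3/10 1/2)))
(defparameter *conflicting* '((0 1/5) (3/5 1)))

(aggregate-max *overlapping*)      ; (2/5 9/10)
(source-consensus *overlapping*)   ; (2/5 1/2), a shorter interval
(aggregate-max *conflicting*)      ; (3/5 1), never empty
(source-consensus *conflicting*)   ; NIL, the intervals do not overlap
```

Returning NIL for an empty consensus interval gives the caller the test for conflicting evidence directly.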
4. FUZZY CMMA SYSTEM


4.1  DATA  IN  THE  CMMA  SYSTEM 

The CMMA system, as described in section 1, is a system driven by features extracted
from contact tracks. These features are linear fits to data, i.e., linear regression of order one. In
general, this fit could be a fuzzy regression, a robust nonlinear regression, or just an
ordinary least-squares regression. Robust regression procedures that can be applied to the
detection and filtering of features in the model assessment problem are discussed in reference 17.
The slope and intercept estimates describe a straight line and the distribution of these parameters
is known, provided the errors are additive normal random deviates. Each parameter is a statistic,
and its distribution is known to be normal (reference 18). The probability density function (PDF)
of the derived parameters is used to represent the information as a fuzzy number, merely by
renormalizing the PDF so that its value at the mode is one. Describing the parameters as fuzzy
numbers enables the uncertainty of the linear fit to be propagated through the rule.
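
For a normal PDF, renormalizing to a peak value of one simply removes the leading constant and leaves the exponential factor. A minimal Common LISP sketch of this step follows; the function names are hypothetical, and the mean and standard deviation stand in for an estimated parameter and its standard error.

```lisp
;; Illustrative sketch only: NORMAL-FUZZY-NUMBER and SAMPLE-MF are
;; hypothetical names, not functions of the exploratory program.
(defun normal-fuzzy-number (mean sigma)
  "Membership function obtained by dividing the normal PDF by its peak value,
so that the mode has membership 1."
  (lambda (x)
    (exp (- (/ (expt (- x mean) 2)
               (* 2 (expt sigma 2)))))))

(defun sample-mf (mf low high n)
  "Sample MF at N+1 equally spaced points, returning a list of (x y) pairs."
  (loop for i from 0 to n
        for x = (+ low (* i (/ (- high low) n)))
        collect (list x (funcall mf x))))

;; A slope estimate of 2.0 with standard error 0.5 (made-up numbers):
(defparameter *slope-mf* (normal-fuzzy-number 2.0 0.5))
(funcall *slope-mf* 2.0)            ; membership 1.0 at the mode
(sample-mf *slope-mf* 1.0 3.0 4)    ; five (x y) points across two standard errors
```

The sampled points are in the list-of-points form that a piecewise-linear approximation of the MF can use.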

Figure 4-1 illustrates the overall process of fuzzifying the data from the standpoint of
contact management. This solution flow diagram is also part of the tracking blackboard that is
currently part of an independent research study (reference 19). First, the track is re-initialized by
detecting the start of the new segment, as discussed in section 1. The new segment is modeled by
a regressive fit, and the distribution of the extracted parameters is used to form the fuzzy sets
labeled A and B, respectively, in the figure. Figure 4-1 shows how the jump axes are partitioned
using trapezoidal term sets, which results in a waffle-iron texture in the two-dimensional feature
space. In both dimensions of the jump-drift space, the trapezoids provide a fuzzy
pseudo-partition of the space. In this figure, only the jump feature is shown along with the term sets
needed to fuzzify the data.


Figure  4-1.  Regression  Fit  of  Input  Data,  Model  Fit  by  Fuzzy  Numbers, 
then  Fuzzify  the  Fuzzy  Numbers  from  the  Fit 


As a practical matter, the data are not fitted directly to the term sets of the features.
Instead, the data are first approximated by a piecewise continuous MF and then the possibility
and the necessity are calculated from this MF. This calculation is illustrated in figure 3-3.
Figure 3-1 showed how a normal data pulse is fitted by a trapezoidal approximation for one of
the parameter values. This simplifies the code considerably and does so with little loss in
performance and with full generality in the shape of the data. The approximation MF is
represented and stored as a sequence of line segments. The details of this representation
are contained in appendix A, so it suffices to say an MF is represented as a list of points,
$((x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n))$, where each point is a pair of the form $(x_i, y_i)$. The advantage
of this format is simplicity, low spatial complexity, and low time complexity of the fuzzy logic
operators. The formation of the possibility and the necessity was discussed conceptually in
section 3. In practice, the possibility $\Pi = \sup_x \min[\mu_A(x), \mu_B(x)]$ is calculated by first forming
the intersection (minimum of the two MFs), and then finding the supremum over the resulting
MF by simply looking for the maximum ordinate over the set of the segment end-points $(x_i, y_i)$.
In like manner, the necessity $N = \inf_x \max[1 - \mu_A(x), \mu_B(x)]$ is calculated by first forming the
complement of the fuzzy set A, taking its union with the fuzzy set B, and then searching for the
minimum ordinate of the piecewise representation of this union. The space and time complexity
of this process is clearly linear in the number of segments, so the calculation can be implemented
efficiently. In figure 3-3a, calculation of the possibility was illustrated once the data had
been approximated by a trapezoidal term set. The necessity is illustrated in figure 3-3b for the
same trapezoidal term set.
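
The calculation described above can be sketched in Common LISP as follows. This is an illustration under stated assumptions and not the appendix A code: every function name is hypothetical, an MF is a list of (x y) points with strictly increasing x, each MF is held constant at its end value outside the listed points, and the points where the two MFs cross are added to the candidate end-points so that the extreme ordinate of the piecewise-linear result is always among the points examined.

```lisp
;; Illustrative sketch only: all names are hypothetical.
(defun mf-value (mf x)
  "Evaluate a piecewise-linear MF at X by linear interpolation."
  (cond ((<= x (first (first mf))) (second (first mf)))
        ((>= x (first (car (last mf)))) (second (car (last mf))))
        (t (loop for ((x0 y0) (x1 y1)) on mf
                 when (and x1 (<= x0 x x1))
                   return (+ y0 (* (- y1 y0) (/ (- x x0) (- x1 x0))))))))

(defun breakpoints (a b)
  "All end-point abscissas of A and B plus the points where A and B cross."
  (let ((xs (sort (remove-duplicates (mapcar #'first (append a b)) :test #'=)
                  #'<)))
    (loop for (u v) on xs
          collect u
          when v
            append (let ((du (- (mf-value a u) (mf-value b u)))
                         (dv (- (mf-value a v) (mf-value b v))))
                     ;; a sign change means the two segments cross inside (u, v)
                     (when (minusp (* du dv))
                       (list (+ u (* (- v u) (/ du (- du dv))))))))))

(defun mf-complement (mf)
  "Complement of an MF: each ordinate y becomes 1 - y."
  (loop for (x y) in mf collect (list x (- 1 y))))

(defun possibility (a b)
  "sup over x of min[A(x), B(x)]."
  (loop for x in (breakpoints a b)
        maximize (min (mf-value a x) (mf-value b x))))

(defun necessity (a b)
  "inf over x of max[1 - A(x), B(x)]."
  (let ((not-a (mf-complement a)))
    (loop for x in (breakpoints not-a b)
          minimize (max (mf-value not-a x) (mf-value b x)))))

;; A triangular data MF against a trapezoidal term set (made-up shapes):
(defparameter *data* '((0 0) (1 1) (2 0)))
(defparameter *term* '((1 0) (2 1) (3 1) (4 0)))

(possibility *data* *term*)   ; 1/2, reached where the two MFs cross at x = 3/2
(necessity *data* *term*)     ; 0, the data peak lies outside the term set
```

Each of the two measures is found by a single pass over the candidate points, which keeps the search itself linear in the number of segments.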

The exploratory program for the fuzzy model assessment is written in Common LISP
using the Common LISP Object System (CLOS) for the object-oriented programming aspect of
the problem. Object-oriented programming attempts to encapsulate the data and the code
associated with objects in the programs. Accordingly, the objects in the code correspond to
natural entities in the decision process. In particular, there are objects for linguistic variables, for
the conclusion, and for the data. An example is the linguistic variable object called "lingvar."
Instances of this object are created for each sensor value, such as spherical bearing. The general
form of the definition looks as follows:


(defclass lingvar ()
  ((base     :initarg :base     :initform 1.0 :accessor base)      ;; scaling term
   (xlow     :initarg :xlow     :initform -1  :accessor xlow)      ;; lower bound of range
   (xhigh    :initarg :xhigh    :initform 1   :accessor xhigh)     ;; upper bound of range
   (numterms :initarg :numterms :initform 7   :accessor numterms)  ;; terms number
   (terms    :initarg :terms    :initform '(ns nm nw ze pw pm ps)
             :accessor terms)                                      ;; terms names
   (nsj :initarg :nsj :initform '() :accessor nsj)   ;; nsj names size of jump
   (nmj :initarg :nmj :initform '() :accessor nmj)   ;; nmj names size of jump
   (nwj :initarg :nwj :initform '() :accessor nwj)   ;; nwj names size of jump
   (zej :initarg :zej :initform '() :accessor zej)   ;; zej names size of jump
   (pwj :initarg :pwj :initform '() :accessor pwj)   ;; pwj names size of jump
   (pmj :initarg :pmj :initform '() :accessor pmj)   ;; pmj names size of jump
   (psj :initarg :psj :initform '() :accessor psj)   ;; psj names size of jump
   (nsd :initarg :nsd :initform '() :accessor nsd)   ;; nsd names size of drift
   (nmd :initarg :nmd :initform '() :accessor nmd)   ;; nmd names size of drift
   (nwd :initarg :nwd :initform '() :accessor nwd)   ;; nwd names size of drift
   (zed :initarg :zed :initform '() :accessor zed)   ;; zed names size of drift
   (pwd :initarg :pwd :initform '() :accessor pwd)   ;; pwd names size of drift
   (pmd :initarg :pmd :initform '() :accessor pmd)   ;; pmd names size of drift
   (psd :initarg :psd :initform '() :accessor psd))  ;; psd names size of drift
  (:documentation "Trapezoidal representation of a MF for sensor variable."))


This definition of the class called "lingvar" is used to represent terms of the sensor
measurements. It represents the term sets for the features extracted from the residual data on
the measurement. The linear fit to the residuals produces two components, the jump and the
drift. The syntax of the term sets is {n|p}{w|m|s}{j|d} if the sensor term set is not zero; for the
zero term set the syntax is {ze}{j|d}. The term set name grammar uses n as negative and p
as positive. The size is given as w for weak, m for moderate, and s for strong. The type of
variable is j for jump and d for drift. The range of the measurement must be bounded,
which incurs no loss of generality. The range is [xlow, xhigh] and the total number of terms
needed to cover the measurement space is given by "numterms," currently set at seven. The
variable "terms" is a list of names for these terms without the qualification of jump or drift.
The important thing is not the names or the syntax, but the fact that the measurement form is
captured in the object class, and each measurement has its own instantiation of this class.
This class represents one of the entities in the fuzzy system diagram that describes the term
sets, and is the information that is needed to fuzzify the data on input.


At present, only the positive parts of the rules are being used because the data received
are folded, i.e., the absolute value of the center of the fuzzy term sets is used. This causes a
reduction in the number of rules. Because the rules are essentially symmetric, the additional
rules are redundant for testing purposes and add little knowledge to the exploratory
testing. Clearly, these rules can be added easily at a later point. Of more importance is the
question of scaling. The data seen in the CMMA problem range over three orders of
magnitude, sometimes more. This is one reason that the trapezoidal sets were used instead of
triangular sets. A more appropriate scaling might be a logarithmic scaling of the data;
however, the data may be either positive or negative, requiring separate scaling of the positive
and negative parts and an adjustment for the values around zero. Although this decision is of
no consequence for this study, it will be important when running larger data sets and
handling both the negative and positive parts of the input data so the transformed data are
more uniformly distributed across a finite range. When making this transformation, however,
the normal shape of the data will also be transformed, and how that looks in the transformed
space must also be determined. The next issue is when the trapezoidal approximation to the
data should be made: before or after the transformation? It is probably
advantageous to approximate first, then transform.
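
One possible form of such a scaling is an odd-symmetric logarithm, which treats the positive and negative parts alike and stays well behaved near zero. The Common LISP sketch below is only one assumed candidate and was not used in this study; the function name signed-log and its scale parameter are hypothetical.

```lisp
;; Illustrative sketch only: SIGNED-LOG and its SCALE parameter are
;; hypothetical; the report does not specify a particular transformation.
(defun signed-log (x &optional (scale 1.0))
  "Odd-symmetric logarithmic compression: sign(x) * log(1 + |x| / scale).
Nearly linear for |x| much smaller than SCALE, logarithmic for larger |x|."
  (* (signum x) (log (+ 1 (/ (abs x) scale)))))

(signed-log 0.0)       ; zero maps to zero
(signed-log 1000.0)    ; log(1001), roughly 6.9
(signed-log -1000.0)   ; the negative of the value above
```

With scale set to 1.0, inputs spanning three orders of magnitude are compressed into a range of roughly seven units on each side of zero, and the scale parameter sets where the transition from linear to logarithmic behavior takes place.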


4.2 STRUCTURE OF THE FUZZY RULES

The  form  of  the  fuzzy  rules  in  the  system  is  similar  to  the  simple  control  rules  that 
were  discussed  in  section  2.  An  example  follows: 


IF the drift in the spherical bearing is POSITIVE WEAK OR MODERATE
AND the jump in the spherical bearing is POSITIVE WEAK OR MODERATE,
THEN a contact maneuver has occurred.

In this format, the features drift and jump are represented by fuzzy numbers, and both the terms
and their logical constructs are given in capital letters. Thus drift is a fuzzy number
and POSITIVE WEAK is a term that is then OR'd with the term POSITIVE MODERATE.
The necessity and possibility of the drift feature being a member of POSITIVE WEAK OR
MODERATE are calculated and the certainty interval is formed for this clause. A similar
procedure is applied to the second clause of the premise. The certainty of the premise is
combined from the certainty of the clauses using the t-norm operator. The certainty of the
conclusion is again an interval, which assumes that the strength of the forward and backward
implication is one, that is, necessity = sufficiency = 1. These variables are in the system
and can be changed for all the rules, or the inference engine can be modified to include the
strengths in the rules themselves. The actual rule itself is almost self-documenting provided one
recalls that in LISP the functional forms are in Polish notation, i.e., (function arg1 arg2 ...). The
cply function is the detachment operator, and the spbrd function (like spbrj) is an accessor
method, which retrieves the data from the data object called sensor. The t+ function is the fuzzy
OR of the two terms POSITIVE WEAK and POSITIVE MODERATE, and the tisa function is
interpreted as an "is a" or "is a member of" function. So the second and third rows of the rule
produce certainty intervals, the output of the tisa function. The term ply is a list that contains the
strength of the implication in the forward and reverse directions. For this report, ply has been set
to (1 1), the (forward backward) implication strength. The conclusion, h2, is a contact maneuver.
Note that the t+ and tisa functions both utilize the piecewise continuous representation of the MFs
adopted for this system.


(cply 'RULE_BR3
      (c*                                                 ;; AND the predicates to form premise
       (tisa (spbrd sensor) (t+ (pwd spbr) (pmd spbr)))   ;; is bearing drift positive weak or mod
       (tisa (spbrj sensor) (t+ (pwj spbr) (pmj spbr))))  ;; is bearing jump positive weak or mod
      ply '(h2))                                          ;; THEN conclude contact maneuver


The rules are kept in a separate file and used in a read-only manner. Because each rule has the
form of a LISP function call, a rule is fired by evaluating that call with the evaluate
function.
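
Firing a rule that is stored as data can be sketched as follows. The sketch assumes the standard Common LISP eval function plays the role of the evaluate function, that the premise clauses are combined by applying the min t-norm to each bound of the certainty intervals, and that the implication uses T = min and S = max. The names sketch-c* and sketch-cply are hypothetical stand-ins for c* and cply, whose source is not reproduced in this report, and the two quoted intervals stand in for the results of the tisa clause evaluations.

```lisp
;; Illustrative sketch only: SKETCH-C* and SKETCH-CPLY are hypothetical
;; stand-ins.  Certainty intervals are lists (lower upper).
(defun sketch-c* (&rest intervals)
  "AND the clause intervals by applying the min t-norm to each bound."
  (list (reduce #'min intervals :key #'first)
        (reduce #'min intervals :key #'second)))

(defun sketch-cply (name premise ply conclusion)
  "Detach the conclusion; PLY is the (forward backward) implication strength."
  (destructuring-bind (b big-b) premise
    (destructuring-bind (s n) ply
      (list name conclusion
            (list (min s b) (max big-b (- 1 n)))))))

;; A rule held as a list, as it might be read from a rule file.
(defparameter *rule*
  '(sketch-cply 'rule-demo
                (sketch-c* '(7/10 1) '(1/2 9/10))
                '(1 1)
                '(h2)))

(eval *rule*)   ; (RULE-DEMO (H2) (1/2 9/10))
```

Because the rule is ordinary list data until it is evaluated, the rule file can remain read-only while the functions it calls are changed or generalized.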


Some of the rules presently have disjunctive forms using a function called c+, but this is
allowed at this point only because the t-norm is the minimum and the s-norm is the maximum. In
this one special case, disjunctions in the premises are allowed. These rules are being rewritten as
separate rules. The code is designed so that any t-norm can be used in place of the minimum and any
s-norm can be used in place of the maximum with minor modifications of the code. After all the
rules have fired, the conclusions can be displayed using certainty plots. Figure 4-2 is a certainty
plot for moderately high signal-to-noise ratio (SNR) test data when a contact maneuver has occurred.
Note that the single hypothesis h2 has several strong certainty intervals supporting it.
No other single hypothesis has any support. However, the multiple hypotheses have
an absolute certainty consisting of the interval [1,1]. These are due to single-sensor measurement
rules designed to support the h2 hypothesis. Note further that all the conclusions having absolute
support contain the h2 hypothesis, which confirms consistency in the rule base for this
hypothesis.


Figure 4-2. Certainty Plot of Data Designed to Test for a Contact Maneuver, Hypothesis h2


The corresponding aggregated certainty plot is given in figure 4-3. Aggregation, as it is
presently set up for testing purposes, uses the more conservative form of conclusion aggregation:
if the certainty intervals for a single conclusion are given by $[c_i, C_i]$, $i = 1, \ldots, n$,
then the certainty interval for the aggregated conclusion is given by
$[\max(c_1, \ldots, c_n), \max(C_1, \ldots, C_n)]$. Again, the code can be easily generalized so
that the generalized s-norms can be used in place of the maximum function. This is an ideal case
with strong data in all the sensor measurements. When the SNR deteriorates, the certainty in the
single hypothesis weakens, causing the necessity to decrease. So the minimum support drops and the
certainty intervals widen. This is illustrated in figure 4-4 by considering a low SNR version of
figure 4-2. Note further that the propagation path certainty for the single hypothesis has also
widened. This is because the SNR of all the sensor measurements was lowered in this example. The
rule that builds the certainty for the change of propagation path alternative requires that the
bearing have zero jump, and as the support of the bearing information widens (higher uncertainty
in the exact bearing) the data lends support to the zero jump clause, causing the pp hypothesis
certainty to increase. In effect, the certainty "leaks" out of the rule for which it was designed
and "spills" into other rules. The weakening of the minimum support for the h2 hypothesis is
caused by the leakage, and the increase of the support for pp is caused by the spillage. Again,
this is not a flaw in the system; instead, it illustrates how the deterioration of the data
uncertainty must temper the strength of the conclusions.
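The conservative aggregation described above can be sketched as follows. This is a minimal illustration, not the report's code: the function names, the tuple representation of an interval, and the choice of the probabilistic sum as the example of an alternative s-norm are all assumptions made here.

```python
from functools import reduce
from typing import Callable, Iterable, Tuple

# A certainty interval [c, C]: lower bound c, upper bound C (assumed layout).
Interval = Tuple[float, float]


def aggregate(intervals: Iterable[Interval],
              s_norm: Callable[[float, float], float] = max) -> Interval:
    """Combine the certainty intervals of one conclusion, bound by bound."""
    items = list(intervals)
    if not items:
        raise ValueError("at least one certainty interval is required")
    lower = reduce(s_norm, (c for c, _ in items))   # s-norm over c_1..c_n
    upper = reduce(s_norm, (C for _, C in items))   # s-norm over C_1..C_n
    return (lower, upper)


def probabilistic_sum(a: float, b: float) -> float:
    """One common s-norm, shown only as a possible replacement for max."""
    return a + b - a * b


rule_outputs = [(0.2, 0.6), (0.5, 0.7), (0.1, 0.9)]
print(aggregate(rule_outputs))                                  # (0.5, 0.9)
print(aggregate([(0.5, 0.5), (0.5, 0.5)], probabilistic_sum))   # (0.75, 0.75)
```

Passing the s-norm as a parameter keeps the maximum as the default while leaving room for the generalization mentioned in the text.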
Figure 4-3. Aggregated Certainty Plot for a Contact Maneuver, Hypothesis h2
5. CONCLUSIONS AND FUTURE DIRECTIONS


In this report, fuzzy logic has been applied to model assessment. This problem is a
pattern recognition problem, which could be modeled as statistical pattern recognition or fuzzy
pattern recognition. The approach investigated uses a generalized form of FPR in which the
conclusions are all possible to varying degrees. Instead of modeling the degrees of possibility of
the conclusions as memberships or as real numbers, the degree is modeled as an interval-valued
set. The conclusions are generated by a fuzzy rule-base system using fuzzy data extracted from
the residuals of linear fits to the sensor data. The fuzzy rules map the data into conclusions just
as discriminant functions used in SPR except for the format of the conclusion. The fuzzy rules
represent heuristics based on physical laws of motion and Doppler shifts. The system has been
tested using only a few key cases to ensure that the rules and the fuzzification and aggregation
algorithms work. The rule-based system works, although performance tests have not been done
because no correct data set was available.

This work has generated promising results and indicated many promising directions of
research. In particular, the performance of the system is dependent on the number and shape of
the terms in the linguistic variables of the extracted features. The design of these terms is
dependent on the clustering or learning algorithms used to create these term sets. There are
numerous methods for learning the term sets. One promising technique, first suggested in
references 13 and 19, applies fuzzy neural networks (FNNs). These FNNs learn not only the terms
but also the fuzzy rules themselves. A second interesting area of study is the relationship
between the fuzzy logic approach and the Dempster-Shafer approach. Clearly these two
approaches are related in some sense, but the detailed relationship is a difficult theoretical
problem. The approximation of discriminant functions by fuzzy rules is a theoretical problem that
begs for an answer. A third area of research mentioned in section 3 is the generalization of
interval-valued certainty representation to term sets of the linguistic variable TRUTH. Finally,
further work is needed on the aggregation of conclusion certainty, not only across rules but also
across data samples. The directions for future study require both theoretical and simulation study
of the issues associated with the design of this modeling problem.
APPENDIX


CONTINUOUS  LINEAR  PIECEWISE 
REPRESENTATION  OF  FUZZY  SETS 


One issue in implementing fuzzy logic and control is the choice of the term sets in the
fuzzification and defuzzification process. Conceptually, this seems to be a trivial issue; however,
for implementing logical operations and for evaluating the inclusion index, this is an important
practical issue. Brute force implementations such as vector representations of the MFs have both
high space and time complexity, which can slow the logic and yield erroneous results. An
elegant and efficient representation of the fuzzy sets is imperative. One such representation is
discussed in this section, the linear piecewise continuous (LPC) approximation of the term sets.

In section 2 a linguistic variable was defined as a quintuple (x, T(x), U, G, M), where the
elements are

1. x is the name of the variable;

2. T(x) is the term set of the variable x;

3. U is the universe of discourse, or domain of definition or base;

4. G is a syntactic rule for generating the name;

5. M is a fuzzy set representing the meaning of T(x); M is called a semantic rule.

An example of a linguistic variable COLOR was displayed in figure 2-1. Conceptually, the fuzzy
sets that define the terms in the linguistic variable are easy to represent both mathematically and
symbolically. Even numerically, the MFs are quite easy to evaluate directly. In general, the result
of logic operations on the MFs or semantic rules of the term sets is not in the same class of
functions that defined the original MFs. For example, the intersection of triangle functions is
again triangular; however, the union is not, provided the intersection operator is the minimum
function and the union operator is the maximum function. For practicality, one wants the
representation for the fuzzy sets to be closed under all the logical operations, such as OR, AND,
and NOT. One set of functions that has this property is the piecewise constant functions.

For this appendix, the fuzzy set intersection and union will be implemented using the
minimum and maximum, respectively. More specifically,
$\mu_{A \cap B}(x) = \min[\mu_A(x), \mu_B(x)]$ and $\mu_{A \cup B}(x) = \max[\mu_A(x), \mu_B(x)]$,
respectively. The fuzzy set complement is given by $\mu_{\bar{A}}(x) = 1 - \mu_A(x)$. The goal is
closure under the three logical operations $\{\cap, \cup, \overline{\;\cdot\;}\}$, the symbol set
for {AND, OR, NOT} = L, the standard set of logical operations. Another set of functions closed
under L is the LPC approximation of the MFs. This representation is efficient and simple: simple
since it is just a string of linear line segments connected at the end points, and efficient since
the space complexity is low and the time complexity is directly related to the space complexity.
The sequence of the points is represented by the list
$((P_0\ P_1)\ (P_1\ P_2)\ \cdots\ (P_{n-1}\ P_n))$, where the points are themselves lists
$P_i = (x_i\ y_i)$. This representation is somewhat redundant, but simplifies coding. Figure A-1
and figure A-2 contain two examples of LPC membership functions. These are trapezoidal MFs, TRAP1
and TRAP2, respectively.
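As an illustration of the segment-list representation, the following sketch builds the redundant list of point pairs and evaluates the MF by linear interpolation. The names `make_lpc` and `evaluate`, the trapezoid breakpoints, and the convention of holding the boundary value outside the domain are assumptions made for this example; they are not taken from the report or from figures A-1 and A-2.

```python
from typing import List, Tuple

Point = Tuple[float, float]          # (x, y) with y in [0, 1]
Segment = Tuple[Point, Point]        # (P_{i-1}, P_i)


def make_lpc(points: List[Point]) -> List[Segment]:
    """Pair consecutive points into the list ((P0 P1) (P1 P2) ... (Pn-1 Pn))."""
    if len(points) < 2:
        raise ValueError("an LPC membership function needs at least two points")
    for (x0, _), (x1, _) in zip(points, points[1:]):
        if x1 <= x0:
            raise ValueError("x values must be strictly increasing")
    return list(zip(points, points[1:]))


def evaluate(lpc: List[Segment], x: float) -> float:
    """Membership value at x; outside the domain the boundary value is held."""
    (first_x, first_y), _ = lpc[0]
    _, (last_x, last_y) = lpc[-1]
    if x <= first_x:
        return first_y
    if x >= last_x:
        return last_y
    for (x0, y0), (x1, y1) in lpc:
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    raise AssertionError("unreachable for a well-formed segment list")


# A hypothetical trapezoid: zero up to 1, rising to 1 at 2, flat to 4, zero from 5.
trap = make_lpc([(0.0, 0.0), (1.0, 0.0), (2.0, 1.0),
                 (4.0, 1.0), (5.0, 0.0), (6.0, 0.0)])
print(len(trap))              # 5
print(evaluate(trap, 1.5))    # 0.5
print(evaluate(trap, 3.0))    # 1.0
```

Each interior point appears twice in the list, which is the redundancy noted above; in exchange, every list element is a complete line segment that can be processed on its own.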
Figure A-1. Trapezoidal Membership Function, TRAP1


Figure  A-2.  Trapezoidal  Membership  Function,  TRAP2 

The sequence of x values associated with each LPC MF induces a partition on the domain
or universe of discourse. The partition can be represented by the sequence of points
$\{x_0, x_1, \ldots, x_n\}$, representing intervals of the partition given by
$\{[x_0, x_1), [x_1, x_2), \ldots\}$.
To implement binary logic operations between two MFs in an efficient manner, both MFs should
have the same partition, including the same domain, which means the first and last points of the
partition must be identical. Therefore, part of the implementation must include a refinement of
both membership partitions so that both MFs are defined on a common partition. Moreover, this
partition is still not refined enough since it needs to include all the cross-over points of the
two MFs. Figure A-3 shows all the points which must be included in the partition, including the
cross-over points.


Figure A-3. Points of the Universe of Discourse
Where the Partition is Made


The algorithm to carry out the binary operations is given as follows:

1. Construct the sequence of points $\{x_0^{i}, \ldots, x_{n_i}^{i}\}$ that represent the partition
associated with the i-th MF, i = 1, 2, respectively.

2. Refine the two partitions by appending the two partitions and then ordering the points to
form a bag. Reduce the bag to a set by removing the duplicate points. The resulting set is a
partition refinement on which both MFs are defined.

3.  Construct  the  list  of  linear  segments  based  on  the  refined  partition  for  both  MFs. 

4. Find all the crossing points located strictly within an interval. Add these points to the
refined partition and reconstruct the list of linear segments again for both functions. Both MFs
associated with the logical operation now have the same partition.

5. Carry out the binary logic function on each interval of the partition, constructing the
linear piece on each interval and concatenating these results to form the resulting MF.

6. Simplify the resulting representation by joining adjacent intervals that represent the same
linear line segment, that is, either adjacent constant line segments or continued lines with the
same slope.

Figure  A-4  illustrates  the  steps  of  the  algorithm  before  collapsing  the  representation.  The  algorithm 
is  conceptually  easy  to  understand  and  moderately  easy  to  implement  in  LISP. 
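The six steps above can be sketched in a few lines. This is only an illustrative version, written in Python rather than LISP, and it makes several assumptions of its own: each MF is stored as a list of breakpoints instead of the redundant segment list, both MFs already share the same first and last x value, and the helper names (`interp`, `refine`, `simplify`, `combine`) and the tolerance are hypothetical.

```python
from typing import Callable, List, Tuple

Point = Tuple[float, float]


def interp(points: List[Point], x: float) -> float:
    """Value of the piecewise linear MF at x (x must lie in the domain)."""
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    raise ValueError("x lies outside the domain of the MF")


def refine(a: List[Point], b: List[Point]) -> List[float]:
    """Steps 1-4: common partition of both MFs, including crossing points."""
    if a[0][0] != b[0][0] or a[-1][0] != b[-1][0]:
        raise ValueError("both MFs must be defined on the same domain")
    xs = sorted({x for x, _ in a} | {x for x, _ in b})   # merge, drop duplicates
    refined = set(xs)
    for x0, x1 in zip(xs, xs[1:]):
        d0 = interp(a, x0) - interp(b, x0)
        d1 = interp(a, x1) - interp(b, x1)
        if d0 * d1 < 0:          # sign change: crossing strictly inside (x0, x1)
            refined.add(x0 + (x1 - x0) * d0 / (d0 - d1))
    return sorted(refined)


def simplify(points: List[Point], tol: float = 1e-12) -> List[Point]:
    """Step 6: drop interior points that lie on the line through their neighbors."""
    kept = list(points[:2])
    for x2, y2 in points[2:]:
        (x0, y0), (x1, y1) = kept[-2], kept[-1]
        cross = (x1 - x0) * (y2 - y0) - (y1 - y0) * (x2 - x0)
        if abs(cross) <= tol:
            kept[-1] = (x2, y2)   # collinear: extend the previous segment
        else:
            kept.append((x2, y2))
    return kept


def combine(a: List[Point], b: List[Point],
            op: Callable[[float, float], float]) -> List[Point]:
    """Step 5: apply op (min for AND, max for OR) at every refined breakpoint."""
    merged = [(x, op(interp(a, x), interp(b, x))) for x in refine(a, b)]
    return simplify(merged)


tri_a = [(0.0, 0.0), (2.0, 1.0), (4.0, 0.0), (6.0, 0.0)]
tri_b = [(0.0, 0.0), (2.0, 0.0), (4.0, 1.0), (6.0, 0.0)]
print(combine(tri_a, tri_b, max))
# [(0.0, 0.0), (2.0, 1.0), (3.0, 0.5), (4.0, 1.0), (6.0, 0.0)]
print(combine(tri_a, tri_b, min))
# [(0.0, 0.0), (2.0, 0.0), (3.0, 0.5), (4.0, 0.0), (6.0, 0.0)]
```

Because neither MF crosses the other inside a refined interval, evaluating the operator at the breakpoints alone reproduces the lower (or upper) segment exactly. For these two triangles the minimum is again a triangle while the maximum is not.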


Figure  A-4a.  Constructing  the  Common  Partition  of  the  Two  Fuzzy  Sets 


Figure  A-4b.  Resulting  Fuzzy  Set  Prior  to  Simplification 
of  the  Partition  for  the  Fuzzy  OR  Operator 
The closure properties of this representation are guaranteed using the operators
$\{\cap, \cup, \overline{\;\cdot\;}\}$.
The minimum operation is carried out on each interval by taking the linear segment lying below the
other, the maximum is implemented by taking the segment above the other, and lastly, the
complement is implemented by replacing y-components with their complement 1 - y. So once the
MFs have a common partition, the logical operations are not only trivial to implement, they are
also efficient.
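Once a common partition is in place, the three operations reduce to per-segment choices, as the following sketch suggests. It assumes both MFs are already given as segment lists over the same partition with all crossing points included; the function names and the use of segment midpoints to decide which segment lies below are choices made for this example, not details from the report.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Segment = Tuple[Point, Point]


def _mid(seg: Segment) -> float:
    """Height of a segment at the middle of its interval."""
    (_, y0), (_, y1) = seg
    return (y0 + y1) / 2.0


def fuzzy_and(a: List[Segment], b: List[Segment]) -> List[Segment]:
    """Minimum: on each interval keep the segment lying below the other."""
    return [sa if _mid(sa) <= _mid(sb) else sb for sa, sb in zip(a, b)]


def fuzzy_or(a: List[Segment], b: List[Segment]) -> List[Segment]:
    """Maximum: on each interval keep the segment lying above the other."""
    return [sa if _mid(sa) >= _mid(sb) else sb for sa, sb in zip(a, b)]


def fuzzy_not(a: List[Segment]) -> List[Segment]:
    """Complement: replace every y-component by 1 - y."""
    return [((x0, 1.0 - y0), (x1, 1.0 - y1)) for (x0, y0), (x1, y1) in a]


def segments(points: List[Point]) -> List[Segment]:
    """Pair consecutive breakpoints into segments."""
    return list(zip(points, points[1:]))


# Two triangles already refined to the common partition {0, 2, 3, 4, 6};
# x = 3 is their crossing point.
a = segments([(0.0, 0.0), (2.0, 1.0), (3.0, 0.5), (4.0, 0.0), (6.0, 0.0)])
b = segments([(0.0, 0.0), (2.0, 0.0), (3.0, 0.5), (4.0, 1.0), (6.0, 0.0)])

print(fuzzy_and(a, b)[1])   # ((2.0, 0.0), (3.0, 0.5))
print(fuzzy_not(fuzzy_and(a, b)) == fuzzy_or(fuzzy_not(a), fuzzy_not(b)))   # True
```

Comparing midpoints is sufficient here because two line segments that do not cross inside an interval keep the same order across the whole interval.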

In summary, the LPC representation of MFs is an efficient representation that includes the
trapezoidal and triangular MFs. The logical operations are exact up to the round-off error effects
of the computer language implementation. Moreover, the LPC representation approximates general
continuous MFs and allows efficient implementation. The LPC representation retains its closure
properties under unions, intersections, and complements.
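To make the approximation claim concrete, the sketch below samples a smooth MF at evenly spaced breakpoints and measures how far the resulting LPC function deviates from it. The Gaussian MF, the domain [-4, 4], the even spacing, and all function names are assumptions chosen for illustration; the report does not prescribe how the breakpoints should be selected.

```python
import math
from typing import Callable, List, Tuple

Point = Tuple[float, float]


def sample_lpc(mf: Callable[[float], float],
               lo: float, hi: float, n: int) -> List[Point]:
    """Breakpoints of an LPC approximation with n equal-width segments."""
    step = (hi - lo) / n
    return [(lo + i * step, mf(lo + i * step)) for i in range(n + 1)]


def interp(points: List[Point], x: float) -> float:
    """Evaluate the LPC function, holding boundary values outside the domain."""
    if x <= points[0][0]:
        return points[0][1]
    if x >= points[-1][0]:
        return points[-1][1]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    raise AssertionError("unreachable for increasing breakpoints")


def max_error(mf: Callable[[float], float],
              points: List[Point], probes: int = 1000) -> float:
    """Largest deviation between mf and its LPC approximation on a probe grid."""
    lo, hi = points[0][0], points[-1][0]
    grid = (lo + (hi - lo) * k / probes for k in range(probes + 1))
    return max(abs(mf(x) - interp(points, x)) for x in grid)


def gaussian(x: float) -> float:
    """A smooth MF used only as a test case."""
    return math.exp(-0.5 * x * x)


coarse = max_error(gaussian, sample_lpc(gaussian, -4.0, 4.0, 8))
fine = max_error(gaussian, sample_lpc(gaussian, -4.0, 4.0, 32))
print(coarse, fine)   # the error shrinks as the partition is refined
```

The number of segments trades accuracy against the space and time cost of the logical operations, since both grow with the size of the partition.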